ALGORITHMIC DETERMINATION OF FLANKING DNA SEQUENCES 
THAT CONTROL THE EXPRESSION OF SETS OF GENES IN 
PROKARYOTIC, ARCHEA AND EUKARYOTIC GENOMES 



Reference to Related Application 

The present application is the subject of Provisional Application Serial No. 
60/208,650 filed June 2, 2000 entitled ALGORITHMIC DETERMINATION OF 
CONNECTRONS FOR THE HIGH LEVEL REGULATION OF GENE 
EXPRESSION.

Introduction 

RNA introduced into a cell by a virus is now known to trigger a cellular defense 
mechanism known as post-transcriptional gene silencing (PTGS). If the viral RNA 
sequence matches a sequence within the cell's genome the associated genes are turned 
off or silenced. This phenomenon is also called 'RNA interference' or RNAi. A 
single-stranded RNA can interact with another single-stranded RNA (known as 
antisense RNA). The single-stranded RNA can also form a triple-stranded complex 
with double-stranded DNA. This triple-stranded complex is known as a Hoogsteen 
helix. This patent application shows how two specific adjacent RNA single-stranded
sequences (called C1 and C2, for Control Sequence 1 and Control Sequence 2)
interact with two distant double-stranded DNA sequences (called T1 and T2, for
Target Sequence 1 and Target Sequence 2) to form a tetradic relationship which is
called a "connectron". The two distant DNA double-stranded sequences (T1 and T2)
must be on the same chromosome in a genome and they must be between about 1kb
and 105kb apart. The adjacent single-stranded RNA sequences (C1/C2) can
be on the same chromosome as the T1 and T2 sequences or on a different one. The C1
sequence is identical to the T1 sequence and the C2 sequence is identical to the T2
sequence. The connectron acts to stabilize the double-stranded DNA by allowing
30nm chromatin particles to form. Genes that lie between the T1 and T2 sequences
when wrapped up in 30nm chromatin particles are not open to promotion and
expression. The connectron (i.e. the tetradic relationship between the T1-T2
sequences and C1/C2 sequences) provides a general explanation for PTGS. A
connectron can be implemented by RNA sequences, PNA (Peptide Nucleic Acid)
sequences or by a zinc-finger DNA Binding Protein (DBP) specific to the T1 and T2
sequences.

Characteristically the adjacent C1/C2 sequences lie in the 3'UTR of a gene. The T1
and T2 sequences do not lie within the translated region of any gene. These
sequences "surround" one or more genes. There are, however, T1 and T2 sequence
pairs that surround one or more C1/C2 sequences that are not 3'UTR to any gene.
These are called "geneless connectrons". There may be promoter sequences that
cause the transcription of these C1/C2 sequences.

A computer-based algorithm that is similar to the algorithm used in U.S. Patent
6,205,404 has been developed to determine the connectron structure of any genome.
This algorithm determines the existence of all the connectrons in the genomic DNA.
Connectrons exist in prokaryotes, archaea, single-celled eukaryotes, multi-celled
eukaryotes, plants and higher animals. Connectron relationships exist between
prokaryotes and their plasmids. The geneless connectrons provide a possible
mechanism for forming a hierarchy of gene expression control that will produce an
understanding of cell differentiation and tissue development.

Each connectron is a unique tetrad of sequences. Each connectron changes the
expression of the genes between the T1 and T2 sequences. The C1 sequence (which is
equivalent to the T1 sequence) and the C2 sequence (which is equivalent to the T2
sequence) are determined by the invention described in this patent application. In
general, the tetrad of connectron sequences can be patented because the structure of
matter is known and the function of specific gene expression modulation is also
known. Gene expression modification can be produced by introducing antisense
RNA or PNA to interact with the C1/C2 RNA sequences or zinc-finger DBPs to interact
with the T1 and T2 sequences. Using connectrons it will be possible to modify cellular
and tissue behavior in a very general manner.

Examples will be given from different genomes to illustrate that the connectron is a
perfectly general and universal concept.

Definitions 

Double stranded DNA - Watson and Crick showed in 1953 that DNA naturally forms
a double-stranded helix. A typical double stranded sequence is

5'-TAGAGGAGTACCAC-3'
3'-ATCTCCTCATGGTG-5'

Hydrogen Bond - The force between a hydrogen atom and another heavier atom such
as Oxygen (O), Nitrogen (N), Phosphorus (P), or Sulfur (S).

Positive strand - The positive strand is normally represented 5' to 3' running left to
right as in

5'-TAGAGGAGTACCAC-3'

Negative strand - The negative strand is normally represented 5' to 3' running right to
left as in

3'-ATCTCCTCATGGTG-5'

Single stranded RNA - Either the positive or the negative strand of the double-stranded
DNA can be transcribed by the polymerase. In RNA U replaces T.

RNA of positive strand sequence 5'-UAGAGGAGUACCAC-3'
RNA of negative strand sequence 5'-GUGGUACUCCUCUA-3'

Antisense RNA - The antisense strand of any RNA sequence is the complement
sequence

RNA sequence           5'-UAGAGGAGUACCAC-3'
Antisense RNA sequence 3'-AUCUCCUCAUGGUG-5'
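
By way of illustration only, the strand conventions defined above can be reproduced with a few string operations. The following Python sketch is not part of the application; the function names are assumptions chosen for readability, and the only inputs are the example sequences already given in the definitions.

```python
# Illustrative helpers only; the function names are assumptions and do not
# come from the application.
DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")
RNA_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def negative_strand(positive: str) -> str:
    """Negative strand written 3'->5', i.e. aligned under the positive strand."""
    return positive.translate(DNA_COMPLEMENT)


def as_rna(dna_5_to_3: str) -> str:
    """RNA with the same sequence as a DNA strand read 5'->3' (U replaces T)."""
    return dna_5_to_3.replace("T", "U")


def antisense(rna: str) -> str:
    """Antisense RNA written 3'->5', aligned under the RNA it pairs with."""
    return rna.translate(RNA_COMPLEMENT)


positive = "TAGAGGAGTACCAC"              # 5'->3'
negative = negative_strand(positive)     # 3'->5'
rna_positive = as_rna(positive)          # 5'->3'
rna_negative = as_rna(negative[::-1])    # reversed so that it reads 5'->3'
rna_antisense = antisense(rna_positive)  # 3'->5'
```

Reversing the negative strand before converting it to RNA is what turns the 3'-to-5' listing used in the definitions into the 5'-to-3' listing used for the RNA of the negative strand.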



Triple Strand Helix - The RNA sequence of an RNA/DNA triple-strand complex is the
same as the positive strand of the DNA

DNA positive strand 5'-TAGAGGAGTACCAC-3'
DNA negative strand 3'-ATCTCCTCATGGTG-5'
RNA strand          5'-UAGAGGAGUACCAC-3'

Promoter - Any region of DNA that binds proteins which engage the polymerase
transcription mechanism.

TATA Box - A region near the 3' end of a promoter with the sequence TATA. 

mRNA - The RNA produced from the DNA by the polymerase as a result of 
transcription 



Start of transcription - The 3' end of a promoter where the polymerase mechanism
begins to transcribe DNA into mRNA.

Exon - Any region of mRNA which is used to code for proteins.

Intron - Any region of mRNA lying between two exons which is not used to code for
proteins. The introns are edited out of the initial RNA transcript to form the mature
mRNA.

3' UTR - The untranslated 3' end of an mRNA, which lies beyond the end of the last
exon. A stop codon in the mRNA causes the ribosome to stop the translation of mRNA
into protein.

End of translation - The 3' end of the 3'-most exon.

Translated region - Any collection of exons and introns.

Gene - Any DNA region that codes for a protein. Introns do not occur in prokaryotic
genes and are sometimes absent from eukaryotic genes. A typical model of a gene
is

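|<----- Promoter ----->|
          |<-TATA Box->|
                        |<-Beginning of Translation
                        |<------------- Translated Region -------------->|
                                                     End of Translation->|
                        |<-Exon->|<-Intron->|<-Exon->|<-Intron->|<-Exon->|<-3' UTR->|
+ strand ----------------------------------------------------------------------------
- strand ----------------------------------------------------------------------------
|<------------------------------------- Gene -------------------------------------->|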

Positive strand gene - Any gene in which the features run 5' to 3' on the positive 
strand 

Negative strand gene - Any gene in which the features run 5' to 3' on the negative 
strand 

C1 sequence - Any positive or negative strand DNA sequence of 20 bases or more.
The C2 sequence must occur in the same chromosome as the C1 sequence.

C2 sequence - Any positive or negative strand DNA sequence of 20 bases or more.
The C1 sequence must occur in the same chromosome as the C2 sequence.

C1/C2 - Any positive or negative strand DNA sequence of 40 or more bases such that
the C1 sequence is adjacent to the C2 sequence.

T1 sequence - Any positive or negative strand DNA sequence of 20 bases or more
that is on the same chromosome as the T2 sequence. The T1 and T2 sequences must
be between about 1kb and 105kb apart.

T2 sequence - Any positive or negative strand DNA sequence of 20 bases or more
that is on the same chromosome as the T1 sequence. The T2 and T1 sequences must
be between about 1kb and 105kb apart.

Last exon gap or Gap-Distance - The number of bases between the end of
transcription and the beginning of the C1/C2 sequence. In prokaryotes and
single-celled eukaryotes this gap can range from no bases to 500 bases. In multi-celled
eukaryotes the gap can be as large as 10,000 bases.

Poly-adenylation signal - A number of Adenosine (A) bases are added to the mRNA
at the end of the 3'UTR.

Possible Connectron - Any set of T1, T2 and C1/C2 sequences such that the C1
sequence is identical to the T1 sequence and the C2 sequence is identical to the T2
sequence. The promoter of some gene causes the mRNA of the gene to be expressed.
The mRNA is edited to eliminate the introns. The whole mRNA including the 3'UTR
can move about in the cell or the nucleus of the cell. The C1/C2 RNA that is part of
the 3'UTR moves to the T1 and T2 DNA sequences. A triple-stranded complex of
the DNA and the RNA forms such that the C1 sequence forms hydrogen bonds with
the T1 sequence and the C2 sequence forms hydrogen bonds with the T2 sequence.
Because the C1 sequence is adjacent to the C2 sequence, the T1 sequence is brought
physically close to the T2 sequence. This produces a loop of between about 1kb and
105kb in the DNA. Histone proteins reduce the length of the DNA by binding 200
bases. Histone/DNA complexes form six-fold symmetry chromatin assemblies. The
diameter of the chromatin assemblies is approximately 30nm.
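
The definition of a Possible Connectron can be illustrated with a naive search. The Python sketch below is not the algorithm of the invention (which is stated to be similar to that of U.S. Patent 6,205,404); it is a brute-force k-mer lookup offered only to make the C1=T1, C2=T2 and distance conditions concrete. The function names, the tuple layout of a hit, the positive-strand-only scan, the use of start-to-start distance, and the exclusion of the control copy as its own target are all assumptions.

```python
from collections import defaultdict

K = 20              # minimum length of C1, C2, T1 and T2 given in the definitions
MIN_LOOP = 1_000    # "about 1kb"
MAX_LOOP = 105_000  # "about 105kb"


def kmer_index(chromosomes, k=K):
    """Map each k-mer to the (chromosome, start) positions where it occurs."""
    index = defaultdict(list)
    for name, seq in chromosomes.items():
        for i in range(len(seq) - k + 1):
            index[seq[i:i + k]].append((name, i))
    return dict(index)


def possible_connectrons(chromosomes, k=K, min_loop=MIN_LOOP, max_loop=MAX_LOOP):
    """Return (c_chrom, c_start, t_chrom, t1_start, t2_start) tuples.

    Only the positive strand is scanned; a fuller sketch would also scan
    the reverse complement of every chromosome.
    """
    index = kmer_index(chromosomes, k)
    hits = []
    for c_chrom, seq in chromosomes.items():
        for c_start in range(len(seq) - 2 * k + 1):
            c1 = seq[c_start:c_start + k]
            c2 = seq[c_start + k:c_start + 2 * k]  # C2 is adjacent to C1
            for t1_chrom, t1 in index[c1]:
                if (t1_chrom, t1) == (c_chrom, c_start):
                    continue  # assumption: the control copy is not its own target
                for t2_chrom, t2 in index[c2]:
                    if t2_chrom != t1_chrom:
                        continue  # T1 and T2 must share a chromosome
                    if (t2_chrom, t2) == (c_chrom, c_start + k):
                        continue
                    if min_loop <= abs(t2 - t1) <= max_loop:
                        hits.append((c_chrom, c_start, t1_chrom, t1, t2))
    return hits


# Toy genome with shortened parameters (k=4, loop of 20 to 50 bases).
toy = {
    "chrA": "ACGA" + "T" * 20 + "GGCA" + "TT",  # T1 at 0, T2 at 24
    "chrB": "CC" + "ACGA" + "GGCA" + "CC",      # C1/C2 at 2
}
toy_hits = possible_connectrons(toy, k=4, min_loop=20, max_loop=50)
```

On the toy genome the search reports a single heterologous hit: the C1/C2 site at position 2 of chrB matches a T1-T2 pair at positions 0 and 24 of chrA. A search over a real genome would need an index that scales better than this per-k-mer dictionary.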

Real Connectron - Any Possible Connectron which is within the Gap-Distance of
some gene.

Homologous connectron - The T1 sequence and the T2 sequence are on the same
chromosome as the C1/C2 sequence.

Heterologous connectron - The T1 sequence and the T2 sequence are on a
chromosome different from the chromosome of the C1/C2 sequence.

Permanent connectron - Any C1/C2 sequence which is 3'UTR to some gene that is
not surrounded by any T1 and T2 sequence pairs.

Transient connectron - Any C1/C2 sequence which is 3'UTR to some gene that is
surrounded by one or more T1 and T2 sequence pairs.

Self-limiting connectron - Any C1/C2 sequence which is 3'UTR to some gene that is
surrounded by the T1 and T2 sequences such that C1=T1 and C2=T2.

Geneless connectron - Any C1/C2 sequence which is not 3'UTR to some gene but is
surrounded by some T1 and T2. A promoter may lie 5' to the C1/C2 sequence.
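
The connectron classes defined above can be expressed as a small decision procedure. The Python sketch below is an illustration under stated assumptions, not a definitive implementation: the `Loop` record, the `control_id` label and the function names are hypothetical, the position of the C1/C2 site stands in for the position of its gene, and "self-limiting" is given precedence over "transient" when both definitions apply.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Loop:
    """A T1-T2 long loop (hypothetical record, not from the application)."""
    chrom: str
    t1: int
    t2: int
    control_id: str  # label of the C1/C2 site with C1=T1 and C2=T2


def homology(control_chrom: str, target_chrom: str) -> str:
    """Homologous if C1/C2 and T1-T2 share a chromosome, else heterologous."""
    return "homologous" if control_chrom == target_chrom else "heterologous"


def classify(control_id, chrom, pos, in_3utr, loops):
    """Classify one C1/C2 site.

    pos      position of the C1/C2 site, used as a proxy for its gene
    in_3utr  True if the site is 3'UTR to some gene
    loops    all known T1-T2 loops in the genome
    """
    enclosing = [
        loop for loop in loops
        if loop.chrom == chrom and min(loop.t1, loop.t2) < pos < max(loop.t1, loop.t2)
    ]
    if not in_3utr:
        # The definitions only name the case that is surrounded by a loop.
        return "geneless" if enclosing else "unclassified"
    if not enclosing:
        return "permanent"
    if any(loop.control_id == control_id for loop in enclosing):
        return "self-limiting"  # assumption: checked before "transient"
    return "transient"


loops = [Loop("chr1", 1000, 5000, "A"), Loop("chr1", 8000, 20000, "B")]
label_a = classify("A", "chr1", 3000, True, loops)   # inside its own loop
label_b = classify("B", "chr1", 3000, True, loops)   # inside a loop formed by "A"
```

In this toy input site "A" lies inside the loop it forms itself, so it is labelled self-limiting, while site "B" at the same position lies inside a loop formed by another site and is labelled transient.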

Bidirectionality of Connectron Excitation - A C1/C2 short loop on one strand selects
a T1-T2 long loop pair on the same or the opposite strand. The C1/C2 short loop has
a complementary C1'/C2' sequence on the opposite strand. Similarly the T1-T2 long
loop pair has a complementary long loop pair T1'-T2'. Wherever a C1/C2, T1-T2
tetrad exists there is a complementary C1'/C2', T1'-T2' tetrad. The C1/C2 short loop
can be transcribed as a 3'UTR to a gene on the same strand. The C1'/C2' short loop
which is on the strand opposite to the C1/C2 short loop can also be transcribed as
a 3'UTR to a gene on the same strand. There are four possible models of action:



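   T1          T2          gene-C1/C2
+ strand ----------------------------
- strand ----------------------------


   T1          T2
+ strand ----------------------------
- strand ----------------------------
                           C2/C1-gene


+ strand ----------------------------
- strand ----------------------------
   T2'         T1'         C2'/C1'-gene


                           gene-C1'/C2'
+ strand ----------------------------
- strand ----------------------------
   T2'         T1'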

Of course, the short loops and the long loops do not have to be on the same
chromosome.
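
The complementary tetrad described under Bidirectionality of Connectron Excitation can be derived mechanically. The Python sketch below is illustrative only; the function names and the short example sequences are assumptions. It shows that each primed sequence is the reverse complement of its unprimed counterpart, and that the complementary short loop therefore reads C2' followed by C1' when written 5' to 3' on the opposite strand.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Sequence of the opposite strand, read 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def complementary_tetrad(c1, c2, t1, t2):
    """Return (C1', C2', T1', T2'), each read 5'->3' on the opposite strand."""
    return revcomp(c1), revcomp(c2), revcomp(t1), revcomp(t2)


# Shortened example sequences (assumed; real sequences are 20 bases or more).
c1, c2 = "AAAC", "GGTT"
t1, t2 = c1, c2  # a connectron requires C1=T1 and C2=T2
c1p, c2p, t1p, t2p = complementary_tetrad(c1, c2, t1, t2)
short_loop_opposite = revcomp(c1 + c2)  # the C1/C2 short loop seen from the other strand
```

Because reverse complementation preserves equality, C1'=T1' and C2'=T2' hold whenever C1=T1 and C2=T2, which is why the complementary tetrad always exists.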

Hierarchy of connectron action - When a C1/C2 is expressed it forms a T1-T2 loop
by forming a connectron. The C1/C2 sequence does not have to be on the same
chromosome as the T1 and T2 sequences. This provides a way of causing interaction
between chromosomes. When the T1-T2 loop forms, any genes in that loop region
which had been expressing C1/C2 sequences in their 3'UTRs now cease expressing
the C1/C2 sequences. The connectrons formed by these C1/C2 sequences will cease
to exist after some time, thus opening up the genes inside the respective T1-T2 loops
to expression. The hierarchy of connectron action alternates between repression
and expression. The connectron hierarchies can be of any depth.

One-to-Many connectron action - One C1/C2 sequence can form connectrons in
many different places on many different chromosomes. The only requirement is that
C1=T1 and C2=T2. This makes it possible for one expression event to control the
expression of many genes on different chromosomes.

Many-to-One connectron action - C1/C2s that come from many different places on
many different chromosomes can form a connectron for a specific T1-T2 sequence
pair. The only requirement is that C1=T1 and C2=T2. This makes it possible for
many different expression events to control the expression of one set of genes on a
particular chromosome.

Many-to-Many connectron action - The arrangement of C1/C2s and T1-T2s across
chromosomes can form a complex web of gene expression control relationships.
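
The one-to-many and many-to-one relationships can be read off a simple two-way mapping between control sites and target loops. The Python sketch below is a minimal illustration; the input format (pairs of opaque labels) and the function names are assumptions, and the labels in the example are invented for demonstration.

```python
from collections import defaultdict


def control_web(connectrons):
    """Build both directions of the control relationship.

    connectrons: iterable of (control_site, target_loop) pairs, one pair
    per connectron. Labels are opaque and hypothetical.
    """
    targets_of = defaultdict(set)   # control site -> loops it can form
    controls_of = defaultdict(set)  # loop -> control sites that can form it
    for control, loop in connectrons:
        targets_of[control].add(loop)
        controls_of[loop].add(control)
    return dict(targets_of), dict(controls_of)


def one_to_many(targets_of):
    """Control sites that form loops in more than one place."""
    return {c for c, loops in targets_of.items() if len(loops) > 1}


def many_to_one(controls_of):
    """Loops that can be formed by more than one control site."""
    return {l for l, controls in controls_of.items() if len(controls) > 1}


pairs = [
    ("chr1:C_a", "chr2:loop_1"),
    ("chr1:C_a", "chr3:loop_2"),  # C_a acts one-to-many
    ("chr4:C_b", "chr3:loop_2"),  # loop_2 is controlled many-to-one
]
targets_of, controls_of = control_web(pairs)
fan_out = one_to_many(targets_of)
fan_in = many_to_one(controls_of)
```

When both sets are non-empty, as here, the genome exhibits the many-to-many web described above.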

Percentage of the Genome Regulated by Connectrons - Since the connectrons for a 
sequenced genome can be calculated, the percentage of the genome that is open to 
connectron regulation can be known. 
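
One way to compute this percentage is sketched below in Python. It assumes that "open to connectron regulation" means lying inside at least one T1-T2 long loop, so overlapping loops are merged before bases are counted; that interpretation, the function name and the half-open coordinate convention are assumptions rather than statements of the application.

```python
from collections import defaultdict


def percent_regulated(loops, chromosome_lengths):
    """Percentage of bases lying inside at least one T1-T2 loop.

    loops: iterable of (chromosome, t1_position, t2_position), treated as
           half-open intervals [min, max)
    chromosome_lengths: dict of chromosome name -> length in bases
    """
    by_chrom = defaultdict(list)
    for chrom, a, b in loops:
        by_chrom[chrom].append((min(a, b), max(a, b)))

    covered = 0
    for intervals in by_chrom.values():
        intervals.sort()
        cur_start, cur_end = intervals[0]
        for start, end in intervals[1:]:
            if start <= cur_end:          # overlaps or touches the current run
                cur_end = max(cur_end, end)
            else:                         # gap: close the current run
                covered += cur_end - cur_start
                cur_start, cur_end = start, end
        covered += cur_end - cur_start

    return 100.0 * covered / sum(chromosome_lengths.values())


example = percent_regulated(
    [("chr1", 0, 50), ("chr1", 25, 75), ("chr2", 20, 10)],
    {"chr1": 100, "chr2": 100},
)
```

In the example the two overlapping loops on chr1 cover 75 bases and the loop on chr2 covers 10, giving 85 of 200 bases, or 42.5 percent.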

Emergent Property - The network of connectrons in any genome emerges from a
knowledge of the complete DNA sequence of the genome. Because both the C1/C2 
sequences and the T1-T2 sequences can be any place in the genome, the whole 
genomic sequence must be known before all the connectrons can be determined. 

Paradigm Shift - For the past fifty years since the discovery by Watson and Crick of 
the double-helical nature of DNA, the reigning paradigm for scientific discovery has 
been the study of one gene and its effects on the behavior of a cell. The advent of 
genomic sequencing and this invention of connectrons that emerge from the whole 
genome will produce a shift in the way scientists view biological systems and the way 
they formulate and execute experiments. The many-to-many relationships between
the connectrons mean that there are many ways in which the expression of a set of
genes can be modulated. The multiplicity of control pathways produces a
system stability that makes it possible for biological systems to be stable for long
periods of evolutionary time. The thinking that goes into formulating scientific
experiments will have to change to accommodate the changes in understanding that
will be induced by the application and extension of this patent application.

Hierarchy of DNA Structuring - The DNA of a cell's genome is structured in a
hierarchy of six levels. Figures 1, 2 and 3 have been adapted from The Molecular
Biology of the Cell by Alberts, Bray, Lewis, Raff, Roberts and Watson [third edition
pages 354, 345 and 348]. As shown in figure 1, the double stranded DNA is level 1.
The double-stranded DNA is wrapped around histone proteins to form a chromatin
particle that is level 2 of the hierarchy. Level 2 is described as "beads-on-a-string" in
figure 1. The chromatin particles are packed in a six-fold symmetry as shown in
figure 2a and figure 2b. These six-fold assemblies, which constitute level 3, have a
diameter of 30 nm. Each 30 nm assembly contains from 18 (i.e. 6 * 3) to 30 (i.e. 6 * 5)
chromatin particles. The 30 nm assemblies aggregate into large loops which range in
length from 5,000 bases to 100,000 bases of DNA. The size of these large loops as
shown in figure 1 is approximately 300 nm. These large loops constitute level 4 of the
structuring hierarchy. As shown in figure 1, at level 5 of the DNA structuring hierarchy
many large loops are condensed to form a structure which is approximately 700 nm in
diameter. The complete chromosome that constitutes level 6 of the hierarchy is
composed of two very long sections of level 5 DNA.

Model of Chromatin Structure - The level 4 structure of DNA as shown in figure 1 
ranges in length from 5,000 to 105,000 bases of DNA. Figure 3 shows that proteins 
are thought to connect portions of the long loops formed by the 30 nm particles to 
form a chromosome axis. These condensed long loops are described as chromomeres 
in The Molecular Biology of the Cell. 



Prior Art 



The chromomere model of DNA structuring was presented by N. A. Resnik et al. [1]
and is based on electron microscopic data. There are more recent papers studying a 
variety of genomes with electron microscopy but no equivalent study of chromomeres 
has been done on a fully sequenced genome. 

A recent News Feature in Nature by T. Gura [2] described the discovery of
post-transcriptional gene silencing in which viral RNA interacts with the transcribed RNA
of the cell to silence the expression of genes. This article describes experiments in C.
elegans and D. melanogaster in which RNA that is complementary to mRNA is
introduced into a cell. This "antisense" RNA has the effect of turning off the
expression of one or more genes. The introduced complementary RNA produces an
"RNA interference" called RNAi.

Thomas Werner and his colleagues at Genomatix in Munich, Germany have 
developed an approach to understanding what they call "Matrix Attachment Region" 
(MAR). Figure 5 shows their interpretation of the structure of DNA surrounding a 
gene. The following description of the MAR is copied from the Genomatix web site:

"Matrix Attachment Regions (MARs) MARs are sequence regions that are 
responsible for the attachment of genomic DNA to the nuclear matrix or scaffold. 
Transcription absolutely requires anchorage of genomic DNA to the nuclear matrix. 

Functional features of MARs: 

Anchoring of regulatory elements like promoters and enhancers to the nuclear 
matrix. 

Ensuring long term activity of promoters and enhancers in chromatin.

Insulation, rendering a functional domain insensitive to position effects.

Genomatix is conducting a research project to define and detect MARs by
computer-analysis."

Brief Description of the Objects of the Invention

An object of the invention is to provide a method of identifying DNA sequences that 
control the expression of different collections of genes in a genome comprising, 
detecting selected DNA sequences adjacent to some genes excluding exons and 
introns. 

An object of the invention is to provide a method of identifying DNA sequences that 
control the expression of different collections of genes comprising, detecting, by 
computer, one or more pairs of non-adjacent DNA sequences which are bound to
two RNA sequences.

An object of the invention is to provide a method of identifying DNA sequences that 
control the expression of different collections of genes in a genome comprising 
detecting changes in connectron behavior in the genome. 

An object of the invention is to provide a method of modifying the expression of 
different gene collections in a genome, comprising detecting changes in connectron 
behavior as a result of an exogenous stimulus. 

An object of the invention is to provide a method of detecting where and when new 
genes are being integrated into a host genome comprising detecting the connectrons in 
said host genome. 

An object of the invention is to provide a method of detecting the expression effect of
different gene collections in a given body comprising detecting the back and forth
flow of connectrons between the chromosomes thereof.

An object of the invention is to provide a method of modifying a given body
comprising modifying the connectron organization therein.

An object of the invention is to provide a method of detecting connectron control and 
target sequences in a given genome comprising: 

determining the base composition of said genome, 

determining one or more sites of control sequence organization, and/or 

determining one or more sites of target application. 

An object of the invention is to provide a method of determining the response of a cell 
in any tissue to changes in the cell's environment and/or genetic composition 
comprising providing a complete genomic DNA sequence for the organism and 
determining the effect of changes in connectrons due to application of a given 
exogenous stimulus to the genome.

An object of the invention is to provide a method of determining in prokaryotes,
archaea, single-celled eukaryotes and multi-celled eukaryotes, the tetradic relationship
T1=C1 and T2=C2 where T1 and T2 are DNA sequences 20 or more bases in length,
where the C1 sequence is adjacent to the C2 sequence, where the T1 and T2
sequences are on the same chromosome, and where the C1/C2 sequences are on the
same chromosome as T1 and T2 or where the C1/C2 sequences are on a chromosome
different from T1 and T2, wherein:

C1 sequence - any positive or negative strand DNA sequence of 20 bases or
more, the C2 sequence must occur in the same chromosome as the C1
sequence,

C2 sequence - any positive or negative strand DNA sequence of 20 bases or
more, the C1 sequence must occur in the same chromosome as the C2
sequence,

C1/C2 - any positive or negative strand DNA sequence of 40 or more bases
such that the C1 sequence is adjacent to the C2 sequence,

T1 sequence - any positive or negative strand DNA sequence of 20 bases or
more that is on the same chromosome as the T2 sequence, the T1 and T2
sequences must be between about 1kb and 105kb apart, and

T2 sequence - any positive or negative strand DNA sequence of 20 bases or
more that is on the same chromosome as the T1 sequence, the T2 and T1
sequences must be between about 1kb and 105kb apart.

An object of the invention is to provide a method of determining in prokaryotes,
archaea, single-celled eukaryotes and multi-celled eukaryotes, the connectron
relationship that permits many different C1/C2 short loops to control the existence of
a T1-T2 long loop and wherein said C1/C2 short loops can be on the same
chromosome or on different chromosomes from the T1-T2 long loop, wherein:

C1 sequence - any positive or negative strand DNA sequence of 20 bases or
more, the C2 sequence must occur in the same chromosome as the C1
sequence,

C2 sequence - any positive or negative strand DNA sequence of 20 bases or
more, the C1 sequence must occur in the same chromosome as the C2
sequence,

C1/C2 - any positive or negative strand DNA sequence of 40 or more bases
such that the C1 sequence is adjacent to the C2 sequence,

T1 sequence - any positive or negative strand DNA sequence of 20 bases or
more that is on the same chromosome as the T2 sequence, the T1 and T2
sequences must be between about 1kb and 105kb apart, and

T2 sequence - any positive or negative strand DNA sequence of 20 bases or
more that is on the same chromosome as the T1 sequence, the T2 and T1
sequences must be between about 1kb and 105kb apart.

An object of the invention is to provide a method of determining in prokaryotes,
archaea, single-celled eukaryotes and multi-celled eukaryotes, the connectron
relationship that permits one C1/C2 short loop to control the existence of many T1-T2
long loops, the C1/C2 short loop can be on the same chromosome or on different
chromosomes from the T1-T2 long loops, wherein:

C1 sequence - any positive or negative strand DNA sequence of 20 bases or
more, the C2 sequence must occur in the same chromosome as the C1
sequence,

C2 sequence - any positive or negative strand DNA sequence of 20 bases or
more, the C1 sequence must occur in the same chromosome as the C2
sequence,

C1/C2 - any positive or negative strand DNA sequence of 40 or more bases
such that the C1 sequence is adjacent to the C2 sequence,

T1 sequence - any positive or negative strand DNA sequence of 20 bases or
more that is on the same chromosome as the T2 sequence, the T1 and T2
sequences must be between about 1kb and 105kb apart, and

T2 sequence - any positive or negative strand DNA sequence of 20 bases or
more that is on the same chromosome as the T1 sequence, the T2 and T1
sequences must be between about 1kb and 105kb apart.

An object of the invention is to provide a method of determining the connectron
relationships between prokaryotes and their plasmids wherein said connectrons
implement a control mechanism between the two genomes that makes it possible for
them to form a symbiotic relationship, and in the case of D. radiodurans the
relationship is not symmetric, and the D. radiodurans genome sends C1/C2 short
loops to the MP1 plasmid, wherein:

C1 sequence - any positive or negative strand DNA sequence of 20 bases or
more, the C2 sequence must occur in the same chromosome as the C1
sequence,

C2 sequence - any positive or negative strand DNA sequence of 20 bases or
more, the C1 sequence must occur in the same chromosome as the C2
sequence,

C1/C2 - any positive or negative strand DNA sequence of 40 or more bases
such that the C1 sequence is adjacent to the C2 sequence,

T1 sequence - any positive or negative strand DNA sequence of 20 bases or
more that is on the same chromosome as the T2 sequence, the T1 and T2
sequences must be between about 1kb and 105kb apart, and

T2 sequence - any positive or negative strand DNA sequence of 20 bases or
more that is on the same chromosome as the T1 sequence, the T2 and T1
sequences must be between about 1kb and 105kb apart.

An object of the invention is to provide a method of determining the connectron
relationships that exist in plants and higher animals.

An object of the invention is to provide a method of determining in prokaryotes,
archaea, single-celled eukaryotes and multi-celled eukaryotes, the connectron
relationship that permits one C1/C2 short loop to control the existence of one or more
T1-T2 long loops without being subject to any expression controls other than those of
the gene to which the C1/C2 is 3'UTR, wherein:

C1 sequence - any positive or negative strand DNA sequence of 20 bases or
more, the C2 sequence must occur in the same chromosome as the C1
sequence,

C2 sequence - any positive or negative strand DNA sequence of 20 bases or
more, the C1 sequence must occur in the same chromosome as the C2
sequence,

C1/C2 - any positive or negative strand DNA sequence of 40 or more bases
such that the C1 sequence is adjacent to the C2 sequence,

T1 sequence - any positive or negative strand DNA sequence of 20 bases or
more that is on the same chromosome as the T2 sequence, the T1 and T2
sequences must be between about 1kb and 105kb apart,

T2 sequence - any positive or negative strand DNA sequence of 20 bases or
more that is on the same chromosome as the T1 sequence, the T2 and T1
sequences must be between about 1kb and 105kb apart, and

3'UTR - untranslated 3' end of an mRNA, which lies beyond the end of the last exon, a
stop codon in the mRNA causes the ribosome to stop the translation of mRNA
into protein.

An object of the invention is to provide a method of determining in prokaryotes,
archaea, single-celled eukaryotes and multi-celled eukaryotes, the connectron
relationship that permits one C1/C2 short loop to control the existence of one or more
T1-T2 long loops such that this C1/C2 short loop is itself subject to expression control
by another T1-T2 long loop which surrounds it, wherein:

C1 sequence - any positive or negative strand DNA sequence of 20 bases or
more, the C2 sequence must occur in the same chromosome as the C1
sequence,

C2 sequence - any positive or negative strand DNA sequence of 20 bases or
more, the C1 sequence must occur in the same chromosome as the C2
sequence,

C1/C2 - any positive or negative strand DNA sequence of 40 or more bases
such that the C1 sequence is adjacent to the C2 sequence,

T1 sequence - any positive or negative strand DNA sequence of 20 bases or
more that is on the same chromosome as the T2 sequence, the T1 and T2
sequences must be between about 1kb and 105kb apart, and

T2 sequence - any positive or negative strand DNA sequence of 20 bases or
more that is on the same chromosome as the T1 sequence, the T2 and T1
sequences must be between about 1kb and 105kb apart.

An object of the invention is to provide a method of determining in prokaryotes,
archaea, single-celled eukaryotes and multi-celled eukaryotes, the connectron
relationship that permits one C1/C2 short loop to control the existence of the T1-T2
long loop that surrounds it, wherein:

C1 sequence - any positive or negative strand DNA sequence of 20 bases or
more, the C2 sequence must occur in the same chromosome as the C1
sequence,

C2 sequence - any positive or negative strand DNA sequence of 20 bases or
more, the C1 sequence must occur in the same chromosome as the C2
sequence,

C1/C2 - any positive or negative strand DNA sequence of 40 or more bases
such that the C1 sequence is adjacent to the C2 sequence,

T1 sequence - any positive or negative strand DNA sequence of 20 bases or
more that is on the same chromosome as the T2 sequence, the T1 and T2
sequences must be between about 1kb and 105kb apart, and

T2 sequence - any positive or negative strand DNA sequence of 20 bases or
more that is on the same chromosome as the T1 sequence, the T2 and T1
sequences must be between about 1kb and 105kb apart.

An object of the invention is to provide a method of determining the connectron
relationships that do not have any genes within the T1-T2 long loop, wherein:

T1 sequence - any positive or negative strand DNA sequence of 20 bases or
more that is on the same chromosome as the T2 sequence, and

T2 sequence - any positive or negative strand DNA sequence of 20 bases or
more that is on the same chromosome as the T1 sequence, and the T2 and T1
sequences must be between about 1kb and 105kb apart.

An object of the invention is to provide a method of determining the geneless
connectron relationship where one C1/C2 short loop controls the existence of many
geneless T1-T2 long loops, wherein:

C1 sequence - any positive or negative strand DNA sequence of 20 bases or
more, the C2 sequence must occur in the same chromosome as the C1
sequence,

C2 sequence - any positive or negative strand DNA sequence of 20 bases or
more, the C1 sequence must occur in the same chromosome as the C2
sequence,

C1/C2 - any positive or negative strand DNA sequence of 40 or more bases
such that the C1 sequence is adjacent to the C2 sequence,

T1 sequence - any positive or negative strand DNA sequence of 20 bases or
more that is on the same chromosome as the T2 sequence, the T1 and T2
sequences must be between about 1kb and 105kb apart, and

T2 sequence - any positive or negative strand DNA sequence of 20 bases or
more that is on the same chromosome as the T1 sequence, the T2 and T1
sequences must be between about 1kb and 105kb apart.

Description of the Drawings and Tables

The above and other objects, advantages and features of the invention will become 
more apparent when considered with the following specification and accompanying 
drawings and tables wherein: 

Figure 1 DNA is structured in six levels of increasing condensation. Double
stranded DNA is level 1. Two turns of DNA are wrapped about each
chromatin particle at level 2. The chromatin particles, which each
contain 200 base pairs, form into 30 nm particles at level 3. The 30
nm particles form into large loops with an approximate dimension of
300 nm at level 4. Metaphase chromosomes form a condensed
structure with an approximate dimension of 700 nm at level 5. An
entire metaphase chromosome has a width of approximately 1400 nm
at level 6. The large loops at level 4 of the DNA structuring are
thought to have between 20,000 (20 kb) and 100,000 (100 kb) base
pairs.

The Molecular Biology of the Cell by Alberts, Bray, Lewis, Raff, 
Roberts and Watson, 3rd. ed. , Garland Publishing, Inc., New York, 
1994, p. 354 

Figure 2 (a) Chromatin DNA forms into a six-fold symmetry 30nm particles. 

(b) The six-fold symmetry 30nm particles form a linear chain with a 
varying number of repeat units. 

The Molecular Biology of the Cell by Alberts, Bray, Lewis, Raff, 
Roberts and Watson , 3rd. ed. , Garland Publishing, Inc., New York, 
1994, p, 345 

Figure 3 Long loops of 30nm particles are thought to be closed at the bottom of 
the loop by proteins. 

The Molecular Biology of the Cell by Alberts, Bray, Lewis, Raff, 
Roberts and Watson, 3rd ed., Garland Publishing, Inc., New York, 
1994, p. 348 

Figure 4 (a) Transcription and Editing, (b) Movement of the RNA through the 
Nucleus, (c) Connectron Formation 

Figure 5 Overview of schematic organization of a typical transcriptionally 
active chromosomal loop. 

From http://genomatix.gsf.de/func_genomics/functional_genomics.html 

Table 1 Connectron Properties for Prokaryotic, Archea and Eukaryotic Genomes 

Table 2 Yeast Inter-Chromosomal Connectron Distribution 

Figure 6 Genome size plotted as a log-log function of the Number of Connectrons 

Figure 7 Number of Sequence Instances plotted as a function of the Number of 
Fragments 

Figure 8 Level 0 - The overall view of the algorithm 

Figure 9 Level 1 - Process Flow of the Algorithm 

Figure 10 Level 2a - two pages - Process Genome into Blocking Fragment File 

Figure 11 Level 2b - two pages - Compute the Connectrons for a Genome 

Figure 12 Level 2c - two pages - Analyze Possible Connectrons 

Figure 13 Level 3a - Setup Genome Usage Memory 

Figure 14 Level 3b - Find DBP-Size Blocking File for T1-Window 

Figure 15 Level 3c - Find DBP-Size Blocking File for T2-Window 

Figure 16 Level 3d - two pages - Find C1/C2 Entries 

Figure 17 Level 3e - two pages - Scan Genome Usage Memory for Potential 
Connectrons 

Description of the Invention 



A connectron is a relationship among four DNA sequences. Each sequence must be 
at least 20 bases long. There is a report by Sharp and Zamore [3] that RNA sequences 
of "about length 25" are important as sources of RNAi. In the computations reported 
here, 27 bases were used as the minimum length of each of the sequences. The T1 
sequence is on one strand of some chromosome in a genome. The T2 sequence is on 
the same strand of the same chromosome as the T1 sequence. The T1 and T2 
sequences (which are each at least 20 bases in length) must be at least 5,000 bases 
distant from each other but they cannot be more than 105,000 bases distant from each 
other. The C1 sequence and the C2 sequence (which are each at least 20 bases in 
length) are adjacent to each other on some strand of some chromosome in the 
genome. The C1/C2 sequences - called the "short loop" - can be on the same strand 
as the T1 and T2 sequences or they can be on the opposite strand. The C1/C2 
sequences of the short loop can be on the same chromosome as the T1 and T2 
sequences but they can also be on a different chromosome in the genome. When a 
genome has only one chromosome, the point is moot. Many genomes, of course, have 
several chromosomes. The C1 sequence is identical to the T1 sequence and the C2 
sequence is identical to the T2 sequence. 
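
The constraints in the preceding paragraph can be sketched as a simple check. The following Python fragment is only an illustration under stated assumptions: the record type Site, the function name is_possible_connectron and the 0-based coordinates are hypothetical, and "adjacent" is taken here to mean that C2 begins immediately after C1. It is not the program used to produce the results in this application.

```python
from dataclasses import dataclass

MIN_SEQ_LEN = 20      # minimum length of each of the four sequences
MIN_LOOP = 5_000      # minimum T1-T2 separation stated in the text
MAX_LOOP = 105_000    # maximum T1-T2 separation stated in the text


@dataclass(frozen=True)
class Site:
    """Hypothetical record for one sequence occurrence in a genome."""
    chromosome: int
    strand: int   # 1 = positive strand, 2 = negative strand
    start: int    # 0-based position of the first base (an assumption)
    seq: str


def is_possible_connectron(t1: Site, t2: Site, c1: Site, c2: Site) -> bool:
    """Check the tetradic constraints described in the text."""
    if min(len(s.seq) for s in (t1, t2, c1, c2)) < MIN_SEQ_LEN:
        return False
    # T1 and T2: same chromosome, same strand, bounded separation.
    if (t1.chromosome, t1.strand) != (t2.chromosome, t2.strand):
        return False
    if not MIN_LOOP <= abs(t2.start - t1.start) <= MAX_LOOP:
        return False
    # C1 and C2: adjacent on one strand of one chromosome
    # (assumed here: C2 starts right where C1 ends).
    if (c1.chromosome, c1.strand) != (c2.chromosome, c2.strand):
        return False
    if c2.start != c1.start + len(c1.seq):
        return False
    # Sequence identity: C1 = T1 and C2 = T2.
    return c1.seq == t1.seq and c2.seq == t2.seq
```

The short loop is deliberately allowed to sit on a different chromosome or strand from the long loop, as the text permits.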

The C1/C2 sequence must be on the same strand as a gene and must either be directly 
adjacent to the gene (i.e. a gap of 0 bases) for prokaryotic genomes or, at this time, be 
within 10,000 bases of the gene for eukaryotic genomes. The size of the gap between 
the end of the gene and the beginning of the C1/C2 sequence is a variable. The C1/C2 
short loop is expressed as the 3'UTR (Un-Translated Region) of the gene. In the case 
of prokaryotic genes that do not normally have introns, the whole mRNA becomes the 
active species for connectron formation. In the case of eukaryotic genes, the whole 
transcript is the active species for connectron formation upon editing of the transcript 
to eliminate the introns. The whole transcript then can move about in the cytoplasm 
of prokaryotic cells or the nucleus of eukaryotic cells. Since the C1 sequence is 
equivalent to the T1 sequence and the C2 sequence is equivalent to the T2 sequence, 
the C1 RNA can form a Hoogsteen triple-stranded RNA/DNA/DNA helix with the 
double-stranded T1 sequence. Similarly the C2 RNA can form a Hoogsteen 
triple-stranded RNA/DNA/DNA helix with the double-stranded T2 sequence. Because 
the C1 sequence and the C2 sequence are adjacent to each other, the C1/T1 
RNA/DNA/DNA Hoogsteen triple helix is brought into physical adjacency to the 
C2/T2 RNA/DNA/DNA Hoogsteen triple helix. RNA/DNA/DNA hybrid helices are 
the most stable form of triple helix. RNA double helices, DNA double helices, RNA 
triple helices and DNA triple helices are all significantly less stable than a 
RNA/double-stranded DNA triple helix. The stable physical adjacency of the two 
triple-stranded Hoogsteen helices ensures that the long loop of double-stranded DNA 
between the T1 sequence and the T2 sequence can then be structured into 30 nm 
chromatin particles as shown in level 4 of figure 1. The genes on either strand of the 
DNA between the T1 sequence and the T2 sequence, when they are structured into the 
30 nm chromatin particles, are not open to promotion and expression. 

The tetradic relationship between the T1 and T2 sequences that form the long loop 
and the C1/C2 sequences that form the short loop is called a connectron. The name 
"connectron" was suggested by J. David Rawn Ph.D. of Towson University. A 
connectron is possible if the T1, T2, C1 and C2 sequences exist. A connectron is real 
if the C1/C2 short loop sequence is adjacent to an expressible gene. If the adjacent 
gene is inside one or more T1-T2 long loops then this connectron is 
said to be transient. If the adjacent gene is not inside any possible T1-T2 long loop 
then the connectron is said to be permanent. If a connectron is inside of a T1-T2 long 
loop that has the same sequences (i.e. T1 is really equal to C1 and T2 is really equal 
to C2) then the connectron is said to be self-limiting. This is true because once the 
C1/C2 sequence is expressed it forms the T1-T2 long loop that then shuts off the 
expression of the gene adjacent to the C1/C2 sequence. Self-limiting connectrons can 
also be called "spike" connectrons since they generate a short-duration spike of the 
C1/C2 short loop sequence. If a T1-T2 long loop does not contain any genes but it 
contains C1/C2 short loop sequences then this type of connectron is said to be 
geneless. The C1/C2 short loops within a geneless T1-T2 long loop can, of course, 
control the expression of genes. 
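
The categories defined above can be expressed as interval tests. The sketch below is an illustration only: the tuple layout (chromosome, start, end) and the function names contains, classify and is_geneless are hypothetical, and containment is assumed to mean that one interval lies entirely within another on the same chromosome.

```python
from typing import List, Tuple

Interval = Tuple[int, int, int]   # (chromosome, start, end), hypothetical layout


def contains(outer: Interval, inner: Interval) -> bool:
    """True if `inner` lies entirely within `outer` on the same chromosome."""
    return outer[0] == inner[0] and outer[1] <= inner[1] and inner[2] <= outer[2]


def classify(driving_gene: Interval, own_loop: Interval,
             all_loops: List[Interval]) -> str:
    """Label a real connectron by where the gene adjacent to its C1/C2 sits."""
    if contains(own_loop, driving_gene):
        return "self-limiting"      # the loop it forms shuts off its own gene
    if any(contains(loop, driving_gene) for loop in all_loops):
        return "transient"          # some other long loop can silence the gene
    return "permanent"              # no long loop can silence the gene


def is_geneless(loop: Interval, genes: List[Interval],
                short_loops: List[Interval]) -> bool:
    """A long loop holding no genes but at least one C1/C2 short loop."""
    has_gene = any(contains(loop, g) for g in genes)
    has_short_loop = any(contains(loop, s) for s in short_loops)
    return not has_gene and has_short_loop
```

The self-limiting test is made first because a self-limiting connectron would otherwise also satisfy the transient test.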

The physical existence and lifetimes of the connectrons must be proved by molecular 
biological experimentation. This physical experimental process, however, is logically 
quite separate from the computational experimentation that has been conducted 
from June of 1999 to May of 2001. The computational search for the existence of 
connectrons has been extremely positive. These computations have shown that 
connectrons exist in prokaryotes, in archea, between prokaryotes and their plasmids, 
in single-celled eukaryotes, in multi-celled eukaryotes, in plants, in higher animals 
and in humans. All of these features and properties are described in the claims 
section that follows. 

The connectron invention is very powerful. It depends only on sequence equivalency. 
The minimum length of the four sequences seems to be about 20 bases. In the 
calculations shown in this patent application, 27 bases have been used as a minimum. 
The Nature News Feature [1] says that other scientists have found RNA sequences of 
length about 25 that have interesting gene silencing properties. The Nature article 
does not give any mechanism. Because of my algorithm and its use on a variety of 
genomes, this patent application provides the computational proof that a particular 
mechanism is highly probable. The connectron invention provides an explanation for 
how communication occurs within a chromosome as well as between chromosomes in 
genomes that have more than one chromosome. Since each T1-T2 long loop can 
contain one or more genes, the connectron invention provides a mechanism for 
turning on and turning off sets of genes simultaneously. In time, the connectron 
invention will provide an explanation for differentiation, that is, for how one cell's 
behavior differs from the behavior of another adjacent cell. It is already clear from 
the computational experiments that have been made on S. cerevisiae, C. elegans and 
D. melanogaster that the number of geneless connectrons increases dramatically as 
evolution proceeds from single-celled eukaryotes (i.e. S. cerevisiae) to 1,000 cell 
eukaryotes (i.e. C. elegans) to visible creatures (i.e. D. melanogaster). The extension 
of this evolutionary progression to plants (i.e. A. thaliana), for which only three 
chromosomes are sequenced, and to humans (i.e. H. sapiens), for which only one 
chromosome is completely sequenced, is not yet complete. Although the complete 
human genome was published in Nature and Science in February of 2001, the 
NIH-sponsored genomic sequencing results are available for about 1/3 of the bases in 
the whole genome. The human genomic sequence determined by Celera Genomics, 
Inc. is available only by subscription. Table 1 shows how the genome size, the 
number of genes, the number of gene-containing and geneless connectrons and the 
percentage of genes controlled are related in many different genomes. 

The C1/C2 short loops originate on one chromosome. The T1-T2 long loops can be 
on the same or different chromosomes. Table 2, which is for yeast (S. cerevisiae), is a 
square matrix of how many C1/C2 short loops on a given chromosome are sent to 
form T1-T2 long loops on other chromosomes. The diagonal of this matrix shows 
that many chromosomes send connectrons to themselves. The striking feature of this 
particular table is that chromosome 6 only sends connectrons to chromosome 12 but 
that it receives connectrons from chromosomes 4, 5, 7, 10, 12, 13, 15 and 16. 
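
A square matrix of the kind described for Table 2 can be tallied from a list of (sending chromosome, receiving chromosome) pairs. The sketch below is illustrative only: the function name connectron_matrix is hypothetical, chromosomes are assumed to be numbered from 1, and no values from Table 2 are reproduced.

```python
from typing import Iterable, List, Tuple


def connectron_matrix(pairs: Iterable[Tuple[int, int]],
                      n_chromosomes: int) -> List[List[int]]:
    """Count connectrons by (sending chromosome, receiving chromosome).

    Row = chromosome carrying the C1/C2 short loop,
    column = chromosome carrying the T1-T2 long loop (both numbered from 1).
    """
    matrix = [[0] * n_chromosomes for _ in range(n_chromosomes)]
    for source, target in pairs:
        matrix[source - 1][target - 1] += 1
    return matrix
```

A diagonal entry counts connectrons that a chromosome sends to itself; a row with a single non-zero entry corresponds to a chromosome that sends connectrons to only one chromosome.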

Any tetrad of connectron sequences (i.e. the T1, T2, C1 and C2 sequences) as well as 
the fact of the adjacency of the C1/C2 short loop sequence to the transcribing gene 
can be patented because the content of matter and the utility can be exactly described. 
The utility of a connectron is that the T1-T2 long loop shuts off the expression of the 
genes that lie between the T1 sequence and the T2 sequence. In the case of geneless 
connectrons, the utility is of a higher level in that the C1/C2 short loops contained in 
the higher-level geneless T1-T2 long loop eventually form other lower-level T1-T2 
long loops around a set of genes. 

The invention of connectrons comes at a particularly important time in biological 
discovery. The geneless connectrons make a many-to-many hierarchical control 
mechanism possible. It is already clear from the determination of the connectrons for 
C. elegans and D. melanogaster that there are as many or more geneless connectrons 
than there are genes. It has been clear for some time that the number of genes in a 
genome is not particularly correlated with the size of the genome. Figure 6 shows 
that the size of a genome is roughly linearly correlated with the number of 
connectrons. 

The connectron invention can be used to generate a model of behavior in any cell. 
The simulation of connectron behavior in different genomes will be the subject of 
another patent application. 

The connectron invention provides for a rational exploitation of the information 
contained in the raw genomic DNA sequence by forming a hierarchy of relationships 
between geneless connectrons, transient connectrons, permanent connectrons, self- 
limiting connectrons and the expression of genes. 

Detailed Description of the Invention 



The algorithm for the determination of connectrons in any genome or any genome 
fragment is represented in the following flow diagrams. The Level 0 diagram in 
figure 8 shows the general relationships in a digital computer. The central processor 
of the digital computer uses the computer program to take genome descriptors, the 
genomic DNA sequences and the tables of gene features to produce a file of blocking 
fragments and a file of the optimal connectrons for the genome. The printer serves to 
make hard copies of the files and this patent application. The level 1 diagram in 
figure 9 shows the three essential steps in the determination of connectrons. The 
genome is first processed into a blocking fragment file. Then the blocking fragments 
are used to compute the connectrons for the genome. Finally the potential 
connectrons are analyzed to determine if the C1/C2 sequences are in the 3'UTR of a 
gene. The level 2a diagram in figure 10 shows the steps required for the processing of 
the genome into a file of blocking fragments. The genomic DNA sequence is 
decomposed into 27-base frames for both the positive and negative strands. These 
fragments are written to the unsorted fragment file. The fragment file is then sorted, 
read and formed into groups of equivalent sequences. The (.blk) file contains the 
sequence and a pointer to the (.gptr) file, which contains the pointers to the position of 
the fragments in the genomes. The position in the genome includes the chromosome 
number, the position in the chromosome and the strand (i.e. positive or negative). A 
sample of these files follows. 

Sample of the (.blk) file for S. cerevisiae 

27-base fragment               Number of instances   Pointer to (.gptr) file 

1111111111111111111111111      0                     1 
1111123244233313332443414      1                     2 
1111141113443133314333341      2                     4 
1111232442333133324434141      1                     5 
1111323311133323144423444      2                     7 
1111332213331341414443413      2                     9 
1111333444112343412323243      1                     10 
111111333444113343412323243    9                     19 
11111411134431333143333414     2                     21 
11111443223134142124434124     2                     23 
11112223234344444443144442     2                     25 
11112244123441[illegible]      8                     33 
11112311241114344334134431     2                     35 
1111[illegible]                [illegible]           36 
111[illegible]                 [illegible]           37 
11112433444244421144134211     [illegible]           38 
11112444311313442332142224     [illegible]           39 
11113131241131114424413231     [illegible]           40 
11113143332344311113133411     [illegible]           41 
11113233111333231444234441     2                     43 

In fragments above 1=G, 2=C, 3=A, 4=T 

Sample of the (.gptr) file for S. cerevisiae 
There are 16 chromosomes in S. cerevisiae 

Item   Chromosome   Position in Chromosome   Direction 

1      0            0                        0 
2      4            11137                    1 
3      12           467619                   1 
4      12           458482                   1 
5      4            11138                    1 
6      12           465759                   2 
7      12           456622                   1 
8      1            219366                   1 
9      8            539978                   1 
10     14           522451                   1 
11     4            1099073                  1 
12     4            1210003                  1 
13     7            539068                   1 
14     12           654136                   1 
15     12           596455                   1 
16     15           121016                   1 
17     15           598127                   2 
18     16           847724                   1 
19     16           59765                    1 
20     12           467620                   1 
21     12           458483                   1 
22     12           461657                   1 
23     12           452520                   1 
24     13           838006                   1 
25     15           288270                   1 
26     4            83593                    1 
27     4            992867                   1 
28     6            162265                   1 
29     7            845687                   1 
30     10           531560                   2 
31     15           282208                   1 
32     16           860418                   1 
33     16           572308                   1 
34     12           465992                   1 
35     12           456855                   1 
36     4            11139                    1 
37     8            89343                    1 
38     4            10302                    1 
39     1            19894                    2 
40     16           9311                     1 
41     10           735203                   1 
42     12           465760                   1 
43     12           456623                   1 

In direction column above 1=positive strand, 2=negative strand 
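
The decomposition of a genome into fragments and their grouping into equivalent sequences, as described for the (.blk) and (.gptr) files, can be sketched as follows. This is a minimal illustration, not the program used here: the function names, the in-memory dictionary in place of sorted files, the letter alphabet in place of the numeric codes, and the convention of recording a negative-strand fragment at the position of its positive-strand window are all assumptions.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

FRAME = 27  # frame length used in the text
COMPLEMENT = str.maketrans("ACGT", "TGCA")

Position = Tuple[int, int, int]  # (chromosome, position, strand); 1 = +, 2 = -


def reverse_complement(seq: str) -> str:
    """Sequence of the opposite strand, read in its own 5'-to-3' direction."""
    return seq.translate(COMPLEMENT)[::-1]


def blocking_fragments(chromosomes: Dict[int, str],
                       frame: int = FRAME) -> Dict[str, List[Position]]:
    """Group every `frame`-base window on both strands by its sequence."""
    groups: Dict[str, List[Position]] = defaultdict(list)
    for chrom, seq in chromosomes.items():
        for pos in range(len(seq) - frame + 1):
            window = seq[pos:pos + frame]
            groups[window].append((chrom, pos, 1))
            groups[reverse_complement(window)].append((chrom, pos, 2))
    # Sorting the keys plays the role of the sorted fragment file.
    return dict(sorted(groups.items()))
```

Each key corresponds to one row of the (.blk) file, the length of its list to the number of instances, and the list itself to the matching rows of the (.gptr) file.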

The level 2b diagram in figure 11 shows the computation of the connectrons. The 
genome descriptors consist of the number and length of the chromosomes. The 
algorithm uses an array that represents several facts about each base position in the 
genome. The level 3a diagram in figure 13 shows the setup of the Genome-Usage 
memory. The gene features are used to prevent the region of the genome that codes 
for proteins from being used for the connectron sequences (i.e. the T1s, the T2s, the 
C1s and the C2s). In the level 2b diagram of figure 11, the algorithm steps through 
each chromosome and within each chromosome through each base position looking 
for acceptable T1-windows of 27 bases. A T1-window can be used to form a 
connectron relationship if there are two or more instances of this fragment in the 
blocking fragment file. The computation in the level 3b diagram of figure 14 
determines if the T1-window is acceptable or not. Once an acceptable T1-window is 
found, the algorithm (in the level 2b diagram of figure 11) looks for acceptable 
T2-window positions that lie between 5,000 and 105,000 bases from the T1-window. 
The computation for determining acceptable T2-window positions is done in the level 
3c diagram of figure 15. Once a pair of T1 and T2 window positions is found, the 
algorithm looks among the instances of these T1 and T2 sequences for a pair of 
sequences C1 and C2 that lie within 200 bases of each other on the same 
chromosome. The computation for determining acceptable C1/C2 windows is shown 
in the level 3d diagram in figure 16. In the level 3e diagram of figure 17 the 
Genome-Usage memory is scanned for the Possible-Connectrons. In the level 2c 
diagram of figure 12 the Possible-Connectrons are scanned to determine if the C1/C2 
sequences are within the Gap-Distance of a gene on either the positive or the negative 
strands. The Real-Connectrons are then written out in several different files including 
the descriptions in the claims section. 
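
The window search described above can be sketched as a brute-force scan over a fragment index. The fragment below is a simplified illustration under stated assumptions: the name possible_connectrons and the index layout are hypothetical, the Genome-Usage memory and the masking of protein-coding regions are omitted, T2 is assumed to lie at a higher coordinate than T1, and no attempt is made at the efficiency a whole genome would require.

```python
from typing import Dict, Iterator, List, Tuple

Position = Tuple[int, int, int]        # (chromosome, position, strand)
Index = Dict[str, List[Position]]      # fragment -> all of its instances
Tetrad = Tuple[Position, Position, Position, Position]


def possible_connectrons(index: Index, min_loop: int = 5_000,
                         max_loop: int = 105_000,
                         max_c_gap: int = 200) -> Iterator[Tetrad]:
    """Yield (T1, T2, C1, C2) positions that satisfy the distance rules."""
    # Only fragments with two or more instances can serve as windows.
    repeated = {f: p for f, p in index.items() if len(p) >= 2}
    for sites1 in repeated.values():
        for t1 in sites1:
            for sites2 in repeated.values():
                for t2 in sites2:
                    same_strand = t1[0] == t2[0] and t1[2] == t2[2]
                    if not same_strand:
                        continue
                    if not min_loop <= t2[1] - t1[1] <= max_loop:
                        continue
                    # C1/C2: other instances of the same two fragments
                    # lying close together on one chromosome and strand.
                    for c1 in sites1:
                        for c2 in sites2:
                            if c1 == t1 or c2 == t2:
                                continue
                            close = 0 < abs(c2[1] - c1[1]) <= max_c_gap
                            if c1[0] == c2[0] and c1[2] == c2[2] and close:
                                yield t1, t2, c1, c2
```

A subsequent pass, not shown, would keep only those tetrads whose C1/C2 pair lies within the Gap-Distance of a gene.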

Examples 



The algorithm for the determination of optimal connectrons has been applied to a 
number of different publicly available genomes. The connectron is a tetradic 
relationship between four sequence elements - T1, T2, C1 and C2. The claims 
presented in this section are written by the program NearGene that implements the 
flow diagram Level 2c of figure 12. The examples are written in a uniform type of 
English. Each example contains some or all of the following elements: 



Name of genome 

Description of T1 

Length of T1-T2 loop 

The chromosome on which the T1-T2 loop exists 

The identifier number within the genome of the T1 sequence 

The T1 sequence 

Description of T2 

The identifier number within the genome of the T2 sequence 

The T2 sequence 

A list of genes whose expression is controlled by the T1-T2 loop 

The common names of the genes as obtained from the NCBI gene feature file 
(.ptt) 

A list of C1/C2 short loops whose expression is controlled by the T1-T2 loop 

The chromosome on which the C1/C2 short loop exists 

The common name of the gene which expresses the C1/C2 short loop as an 
RNA 

The sequence of the C1/C2 short loop 

A list of C1/C2 short loops that control the formation of the T1-T2 loop 

The chromosome on which the C1/C2 short loop exists 

The common name of the gene which expresses the C1/C2 short loop as an 
RNA 

The sequence of the C1/C2 short loop 

The match between the C1/C2 sequence and the T1 sequence 

The match between the C1/C2 sequence and the T2 sequence 



The uniform descriptions make it possible to rapidly comprehend the specifics in each 
example. 

When a sequence element is very long a series of four dots has been inserted between 
the beginning and ending sequence groups. A variable number of bases have been 
deleted. 
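
The "match" elements listed above report the part of a C1/C2 short loop that is shared with a T1 or T2 sequence. One way to compute such a shared stretch is a longest-common-substring search, sketched below with the Python standard library. This is an assumption about how a match could be derived, not a description of the NearGene program, and the function name longest_match is hypothetical.

```python
from difflib import SequenceMatcher


def longest_match(short_loop: str, target: str) -> str:
    """Return the longest contiguous stretch shared by the two sequences."""
    # autojunk=False keeps frequent bases from being ignored in long inputs.
    matcher = SequenceMatcher(None, short_loop, target, autojunk=False)
    m = matcher.find_longest_match(0, len(short_loop), 0, len(target))
    return short_loop[m.a:m.a + m.size]
```

A match shorter than the minimum sequence length (20 bases in the text) would not qualify as a C1 or C2 element.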

Index of Connectron Samples 

1. Connectrons occur in prokaryotes, archea, single-celled eukaryotes and 
multi-celled eukaryotes. 

2. Many connectrons control the expression of one set of genes in prokaryotes, 
archea, single-celled eukaryotes and multi-celled eukaryotes. 

3. One connectron controls the expression of many sets of genes in prokaryotes, 
archea, single-celled eukaryotes and multi-celled eukaryotes. 

4. Connectrons occur between prokaryotes and their plasmids. 

5. Connectrons occur in plants and higher animals. 

6. Permanent connectrons exist in prokaryotes, archea, 
single-celled eukaryotes and multi-celled eukaryotes. 

7. Transient connectrons exist in prokaryotes, archea, 
single-celled eukaryotes and multi-celled eukaryotes. 

8. Self-limiting connectrons occur in prokaryotes, archea, single-celled 
eukaryotes and multi-celled eukaryotes. 

9. Geneless connectrons exist in single-celled and 
multi-celled eukaryotes. 

10. One connectron controls many geneless connectrons 
in single-celled and multi-celled eukaryotes. 

1. Connectrons occur in prokaryotes, archea, single-celled eukaryotes and 
multi-celled eukaryotes. 

Connectrons exist as tetradic relationships where the sequence T1 is equivalent to the 
sequence C1 (written T1=C1) and where the sequence T2 equals the sequence C2 
(written T2=C2), where T1 and T2 are DNA sequences 20 or more bases in length, 
where the C1 sequence is adjacent to the C2 sequence, where the T1 and T2 
sequences are on the same chromosome, and where the C1/C2 sequences are on the 
same chromosome as T1 and T2 or where the C1/C2 sequences are on a chromosome 
different from T1 and T2. The connectron relationship has been found to exist in 
prokaryotes, archea, single-celled eukaryotes and multi-celled eukaryotes. 

Example of a prokaryote connectron - E. coli 

In this example the existence of the T1-T2 (3197-3308) long loop is controlled by 
three C1/C2 short loops (3307, 3432 and 2218). The T1-T2 long loop controls the 
expression of 64 genes on chromosome 1 in addition to six C1/C2 (3204, 3206, 3223, 
3228, 3301 and 3327) short loops. The C1/C2 short loop 3327 lies outside the range 
of the T1-T2 long loop (3197-3308) but this C1/C2 is expressed as a 3'UTR to the 
gene hemG that is within the range of the T1-T2 long loop. 



3307 Chromosome 1 
3432 Chromosome 1 
2218 Chromosome 1 



| Chromosome 1 | 
3197 3308 

| 3204 3206 | 
| 3224 3228 
| 3301 3327 

Connectron control elements for chromosome 1 of the E. coli genome 

A double stranded DNA loop of length 93.542 kilo-bases on chromosome 1 is 
bounded on the left by a T1 sequence whose identifier is 3197. This T1 control 
element has the DNA sequence 

AAAAAATGCGCGGTCAGAAAATTATTTTAAATTTCCTCTTGTCAGGCCGG 
AATAACTCCCTATAATGCGCCACCACTGACACGGAACAACGGCAAACACG 
CCGCCGGGTCAGCGGGGTTCTCCTGAGAACTCCGGCAGAGAAAGCAAAA 
ATAAATGCTTGACTCTGTAGCGGGAA 

This double stranded DNA loop is bounded on the right by a T2 control element 
whose identifier is 3308. This T2 control element has the DNA sequence 

TAAATTTCCTCTTGTCAGGCCGGAATAACTCCCTATAATGCGCCACCACTG 
ACACGGAACAACGGCAAACACGCCGCCGGGTCAGCGGGGTTCTCCTGAG 
AACTCCGGCAGAGAAAGCAAAAATAAATGCTTGACTCTGTAGCGGGAAG 
GCGTATTATGCACACCCCGCGCCGCT 


This long T1/T2 double stranded DNA loop modulates the expression of the 
following genes 





rrsC   gltU   rrlC    rrfC   aspT   trpT   yifA   yifE   yifB 
ilvL   ilvG_1 ilvM    ilvE   ilvD   ilvA   ilvY   ilvC   ppiC 
b3776  rep    gppA    rhlB   trxA   rhoL   rho    rfe    wzzE 
wecB   rffH   wecD    wecE   wzxE   yifM_2 wecG   yifK 
argX   hisR   leuT    proM   aslB   aslA   hemY   hemX 
hemD   cyaA   cyaY    b3808  dapF   uvrD   b3814  corA 
yigF   yigG   rarD    yigI   pldA   recQ   yigJ   yigK   pldB 
yigL   yigM   metR    metE   ysgA   udp    yigN   ubiE   yigP 
b3836  yigU   yigW_1  rfaH   yigC   ubiB   fadA   fadB 
pepQ   trkH   hemG 



This long T1/T2 double stranded DNA loop modulates the expression of the 
following C1/C2 short loops 

A C1/C2 short loop on chromosome 1 whose identifier is 3204 controls the 
expression of the genes of one or more other T1/T2 long loops. This C1/C2 short 
loop is expressed as a RNA single strand that is 3'UTR to the gene rrsC and has the 
DNA sequence 

GATGTGCCCAGATGGGATTAGCTAGTAGGTGGGGTAACGGCTCACCTAGG 
CGACGATCCCTAGCTGGTCTGAGAGGATGACCAGCCACACTGGAACTGAG 
ACACGGTCCAGACTCCTACGGGAGGCAGCAGTGGGGAATATTGCACAATG 
GGCGCAAGCCTGATGCAGCCATGCCGCGTGTATGAA 

A C1/C2 short loop on chromosome 1 whose identifier is 3206 controls the 
expression of the genes of one or more other T1/T2 long loops. This C1/C2 short 
loop is expressed as a RNA single strand that is 3'UTR to the gene rrsC and has the 
DNA sequence 

GTCCCCTTCGTCTAGAGGCCCAGGACACCGCCCTTTCACGGCGGTAACAG 
GGGTTCGAATCCCCTAGGGGACGCCACTTGCTGGTTTGTGAGTGAAAGTC 

ACCTGCCTTAATATCTCAAAACTCATCTTCGGGTGATGTTTGAGATATTTG 
CTCTTTAAAAATCTGGATCAAGCTGAAAATTGAAA 

A C1/C2 short loop on chromosome 1 whose identifier is 3223 controls the 
expression of the genes of one or more other T1/T2 long loops. This C1/C2 short 
loop is expressed as a RNA single strand that is 3'UTR to the gene rrlC and has the 
DNA sequence 



GCTGAAGTAGGTCCCAAGGGTATGGCTGTTCGCCATTTAAAGTGGTACGC 
GAGCTGGGTTTAGAACGTCGTGAGACAGTTCGGTCCCTATCTGCCGTGGG 
CGCTGGAGAACTGAGGGGGGCTGCTCCTAGTACGAGAGGACCGGAGTGG 
ACGCATCACTGGTGTTCGGGTTGTCATGCCAATGGCA 

A C1/C2 short loop on chromosome 1 whose identifier is 3225 controls the 
expression of the genes of one or more other T1/T2 long loops. This C1/C2 short 
loop is expressed as a RNA single strand that is 3'UTR to the gene rrlC and has the 
DNA sequence 

AAACAGAATTTGCCTGGCGGCCGTAGCGCGGTGGTCCCACCTGACCCCAT 
GCCGAACTCAGAAGTGAAACGCCGTAGCGCCGATGGTAGTGTGGGGTCTC 
CCCATGCGAGAGTAGGGAACTGCCAGGCATCAAATTAAGCAGTA 

A C1/C2 short loop on chromosome 1 whose identifier is 3228 controls the 
expression of the genes of one or more other T1/T2 long loops. This C1/C2 short 
loop is expressed as a RNA single strand that is 3'UTR to the gene rrfC and has the 
DNA sequence 

GGTCATAAAACCGGTGGTTGTAAAAGAATTCGGTGGAGCGGTAGTTCAGT 
CGGTTAGAATACCTGCCTGTCACGCAGGGGGTCGCGGGTTCGAGTCCCGT 
CCGTTCCGCCAC 

A C1/C2 short loop on chromosome 1 whose identifier is 3301 controls the 
expression of the genes of one or more other T1/T2 long loops. This C1/C2 short 
loop is expressed as a RNA single strand that is 3'UTR to the gene ubiB and has the 
DNA sequence 

TTATCGTGCCTACAAATAGTCCGAACCGTAGGCCGGATAAGGCGTTTACG 
CCGCATC 

A C1/C2 short loop on chromosome 1 whose identifier is 3307 controls the 
expression of the genes of one or more other T1/T2 long loops. This C1/C2 short 
loop is expressed as a RNA single strand that is 3'UTR to the gene fadA and has the 
DNA sequence 

TGCCGGATGCGGCGTAAACGCCTTATCCGGCCTACGGTTCGGACTATTTGT 
AGGCA 

A C1/C2 short loop on chromosome 1 whose identifier is 3327 controls the 
expression of the genes of one or more other T1/T2 long loops. This C1/C2 short 
loop is expressed as a RNA single strand that is 3'UTR to the gene hemG and has the 
DNA sequence 

AAAAAATGCGCGGTCAGAAAATTATTTTAAATTTCCTCTTGTCAGGCCGG 

AATAACTCCCTATAATGCGCCACCACTGACACGGAACAACGGCAAACACG 

CCGCCGGGTCAGCGGGGTTCTCCTGAGAACTCCGGCAGAGAAAGCAAAA 

ATAAATGCTTGACTCTGTAGCGGGAAGGCGTATTATG...CCCGTCACACCA 

TGGGAGTGGGTTGCAAAAGAAGTAGGTAGCTTAACCTTCGGGAGGGCGCT 

TACCACTTTGTGATTCATGACTGGGGTGAAGTCGTAACAAGGTAACCGTA 

GGGGAACCTGCGGTTGGATCACCTCCTTACCTTAAAGAAGCGTTCTTTG 

The expression of genes in this T1/T2 long loop is controlled by the following C1/C2 
short loops. 

A C1/C2 short loop on chromosome 1 whose identifier is 3307 controls the 
expression of the genes in this T1/T2 long loop. This C1/C2 short loop is expressed 
as a RNA single strand that is 3'UTR to the gene hemG and has the DNA sequence 

AAAAAATGCGCGGTCAGAAAATTATTTTAAATTTCCTCTTGTCAGGCCGG 
AATAACTCCCTATAATGCGCCACCACTGACACGGAACAACGGCAAACACG 
CCGCCGGGTCAGCGGGGTTCTCCTGAGAACTCCGGCAGAGAAAGCAAAA 
ATAAATGCTTGACTCTGTAGCGGGAAGGCGTATTATG...CCCGTCACACCA 
TGGGAGTGGGTTGCAAAAGAAGTAGGTAGCTTAACCTTCGGGAGGGCGCT 

TACCACTTTGTGATTCATGACTGGGGTGAAGTCGTAACAAGGTAACCGTA 

GGGGAACCTGCGGTTGGATCACCTCCTTACCTTAAAGAAGCGTTCTTTG 

The match between the T1 sequence and the C1/C2 sequence is 

AAAAAATGCGCGGTCAGAAAATTATTTTAAATTTCCTCTTGTCAGGCCGG 
AATAACTCCCTATAATGCGCCACCACTGACACGGAACAACGGCAAACACG 
CCGCCGGGTCAGCGGGGTTCTCCTGAGAACTCCGGCAGAGAAAGCAAAA 
ATAAATGCTTGACTCTGTAGCGGGAA 

The match between the T2 sequence and the C1/C2 sequence is 

TAAATTTCCTCTTGTCAGGCCGGAATAACTCCCTATAATGCGCCACCACTG 
ACACGGAACAACGGCAAACACGCCGCCGGGTCAGCGGGGTTCTCCTGAG 
AACTCCGGCAGAGAAAGCAAAAATAAATGCTTGACTCTGTAGCGGGAAG 
GCGTATTATGCACACCCCGCGCCGCT 

A C1/C2 short loop on chromosome 1 whose identifier is 3432 controls the 
expression of the genes in this T1/T2 long loop. This C1/C2 short loop is expressed 
as a RNA single strand that is 3'UTR to the gene btuB and has the DNA sequence 

TGCGCGGTCAGAAAATTATTTTAAATTTCCTCTTGTCAGGCCGGAATAACT 

CCCTATAATGCGCCACCACTGACACGGAACAACGGCAAACACGCCGCCGG 

GTCAGCGGGGTTCTCCTGAGAACTCCGGCAGAGAAAGCAAAAATAAATG 

CTTGACTCTGTAGCGGGAAGGCGTATTATGCACACC..ACACCATGGGAGT 

GGGTTGCAAAAGAAGTAGGTAGCTTAACCTTCGGGAGGGCGCTTACCACT 

TTGTGATTCATGACTGGGGTGAAGTCGTAACAAGGTAACCGTAGGGGAAC 

CTGCGGTTGGATCACCTCCTTACCTTAAAGAAGCGT 

The match between the T1 sequence and the C1/C2 sequence is 

TGCGCGGTCAGAAAATTATTTTAAATTTCCTCTTGTCAGGCCGGAATAACT 
CCCTATAATGCGCCACCACTGACACGGAACAACGGCAAACACGCCGCCGG 
GTCAGCGGGGTTCTCCTGAGAACTCCGGCAGAGAAAGCAAAAATAAATG 
CTTGACTCTGTAGCGGGAA 

The match between the T2 sequence and the C1/C2 sequence is 

TAAATTTCCTCTTGTCAGGCCGGAATAACTCCCTATAATGCGCCACCACTG 
ACACGGAACAACGGCAAACACGCCGCCGGGTCAGCGGGGTTCTCCTGAG 
AACTCCGGCAGAGAAAGCAAAAATAAATGCTTGACTCTGTAGCGGGAAG 
GCGTATTATGCACACCCCGCGCCGCT 

A C1/C2 short loop on chromosome 1 whose identifier is 2218 controls the 
expression of the genes in this T1/T2 long loop. This C1/C2 short loop is expressed 
as a RNA single strand that is 3'UTR to the gene clpB and has the DNA sequence 

CTTGTCAGGCCGGAATAACTCCCTATAATGCGCCACCACTGACACGGAAC 
AACGGCAAACACGCCGCCGGGC 

The match between the T1 sequence and the C1/C2 sequence is 

CTTGTCAGGCCGGAATAACTCCCTATAATGCGCCACCACTGACACGGAAC 
AACGGCAAACACGCCGCCGGGC 

The match between the T2 sequence and the C1/C2 sequence is 

CTTGTCAGGCCGGAATAACTCCCTATAATGCGCCACCACTGACACGGAAC 
AACGGCAAACACGCCGCCGGGTC 



Example of an archea connectron - H. pylori 

In this example the existence of the T1-T2 (812-882) long loop is controlled by three 
C1/C2 short loops (881, 813 and 1214). The T1-T2 long loop controls the expression 
of 54 genes on chromosome 1 in addition to one C1/C2 (843) short loop. 



881 Chromosome 1
813 Chromosome 1
1241 Chromosome 1

* *

Chromosome 1
812 882
| 842 |



Connectron control elements for chromosome 1 of H. pylori genome 

A double stranded DNA loop of length 96.385 kilo-bases on chromosome 1 is
bounded on the left by a T1 sequence whose identifier is 812. This T1 control
element has the DNA sequence

TTTTACTCATAGGGTTTTTATAGTTCCTAGCGGAACTAAAGCA

This double stranded DNA loop is bounded on the right by a T2 control element
whose identifier is 882. This T2 control element has the DNA sequence

TAGCGGAACTAAAGCATTCATCCCAAACACTAAAGATATTTGG

This long T1/T2 double stranded DNA loop modulates the expression of the
following genes

HP0999 HP1000 HP1001 HP1002 HP1003 HP1005 HP1006
HP1008 HP1009 HPtRNA-Pro HP1010 HP1011 HP1013 HP1015
HP1017 HP1018 HP1020 HP1021 HP1022 HP1023 HP1024
HP1025 HP1027 HP1028 [illegible] HP1030 HP1031 [illegible]
HP1034 HP1038 HP1039 [illegible] HP1040 HP1041 HP1042
HP1044 HP1045 HP1046 HP1051 HP1052 HP1055 HP1056
HP1058 HP1060 HP1065 HPtRNA-Ser HP1066 HP1067 HP1069
HP1070 HP1074 HP1075 HP1076 HP1077 HP1078 HP1079
HP1080 HP1081 HP1083 HP1084 HP1085 HP1088 HP1091
HP1092 HP1093 HP1094 HP1095 HP1096









This long T1/T2 double stranded DNA loop modulates the expression of the
following C1/C2 short loops

A C1/C2 short loop on chromosome 1 whose identifier is 813 controls the expression
of the genes of one or more other T1/T2 long loops. This C1/C2 short loop is
expressed as a RNA single strand that is 3'UTR to the gene HP0998 and has the DNA
sequence

TTTTACTCATAGGGTTTTTATAGTTCCTAGCGGAACTAAAGCATTCATCCC
AAACACTAAAGATATTTGG

The expression of genes in this T1/T2 long loop is controlled by the following C1/C2
short loops.

A C1/C2 short loop on chromosome 1 whose identifier is 881 controls the expression
of the genes of one or more other T1/T2 long loops. This C1/C2 short loop is
expressed as a RNA single strand that is 3'UTR to the gene HP1096 and has the DNA
sequence

TTTTACTCATAGGGTTTTTATAGTTCCTAGCGGAACTAAAGCATTCATCCC
AAACACTAAAGATATTTGG

The match between the T1 sequence and the C1/C2 sequence is

TTTTACTCATAGGGTTTTTATAGTTCCTAGCGGAACTAAAGCA



The match between the T2 sequence and the C1/C2 sequence is

TAGCGGAACTAAAGCATTCATCCCAAACACTAAAGATATTTGG 

The expression of genes in this T1/T2 long loop is controlled by the following C1/C2
short loops.

A C1/C2 short loop on chromosome 1 whose identifier is 813 controls the expression
of the genes in this T1/T2 long loop. This C1/C2 short loop is expressed as a RNA
single strand that is 3'UTR to the gene HP0998 and has the DNA sequence

TTTTACTCATAGGGTTTTTATAGTTCCTAGCGGAACTAAAGCATTCATCCC 
AAACACTAAAGATATTTGG 

A C1/C2 short loop on chromosome 1 whose identifier is 881 controls the expression 
of the genes in this T1/T2 long loop. This C1/C2 short loop is expressed as a RNA 
single strand that is 3'UTR to the gene HP1096 and has the DNA sequence

TTTTACTCATAGGGTTTTTATAGTTCCTAGCGGAACTAAAGCATTCATCCC 
AAACACTAAAGATATTTGG 

The match between the T1 sequence and the C1/C2 sequence is
TTTTACTCATAGGGTTTTTATAGTTCCTAGCGGAACTAAAGCA 
The match between the T2 sequence and the C1/C2 sequence is 
TAGCGGAACTAAAGCATTCATCCCAAACACTAAAGATATTTGG

A C1/C2 short loop on chromosome 1 whose identifier is 1241 controls the
expression of the genes in this T1/T2 long loop. This C1/C2 short loop is expressed
as a RNA single strand that is 3'UTR to the gene HP1535 and has the DNA sequence

TTTTACTCATAGGGTTTTTATAGTTCCTAGCGGAACTAAAGCATTCATCCC 
AAACA 

The match between the T1 sequence and the C1/C2 sequence is
TTTTACTCATAGGGTTTTTATAGTTCCTAGCGGAACTAAAGCA 
The match between the T2 sequence and the C1/C2 sequence is 
TAGCGGAACTAAAGCATTCATCCCAAACA 
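
The T1 and T2 matches listed for short loop 1241 can be reproduced with a longest-common-substring search. The Python sketch below is an illustration only: the function name `longest_common_substring` is hypothetical, and the application does not state that this particular procedure was used to produce the listed matches. The three sequences are copied from the H. pylori example above.

```python
def longest_common_substring(a: str, b: str) -> str:
    """Return the longest contiguous run of bases shared by a and b."""
    best_len = 0
    best_end = 0  # exclusive end index of the best run within a
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                # Extend the run that ended at the previous base of each string.
                curr[j] = prev[j - 1] + 1
                if curr[j] > best_len:
                    best_len = curr[j]
                    best_end = i
        prev = curr
    return a[best_end - best_len:best_end]


# Sequences as listed in the H. pylori example (identifiers 812, 882, 1241).
T1_812 = "TTTTACTCATAGGGTTTTTATAGTTCCTAGCGGAACTAAAGCA"
T2_882 = "TAGCGGAACTAAAGCATTCATCCCAAACACTAAAGATATTTGG"
C1C2_1241 = "TTTTACTCATAGGGTTTTTATAGTTCCTAGCGGAACTAAAGCATTCATCCCAAACA"

t1_match = longest_common_substring(T1_812, C1C2_1241)
t2_match = longest_common_substring(T2_882, C1C2_1241)
print(t1_match)
print(t2_match)
```

With these inputs the T1 match is the whole T1 sequence and the T2 match is the 29-base string listed above, so the sketch agrees with the matches reported for short loop 1241.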



Example of a single-celled connectron - S. cerevisiae

In this example the existence of the T1-T2 (1352-1416) long loop on chromosome 4 
is controlled by one C1/C2 short loop (4213) on chromosome 10. The T1-T2 long 
loop controls the expression of 34 genes on chromosome 4 in addition to one C1/C2 
(1356) short loop. 

4213 Chromosome 10 

* * * 

| Chromosome 4 |
1352 1416
| 1356 |



Connectron control elements for chromosome 1 of S. cerevisiae genome

A double stranded DNA loop of length 68.908 kilo-bases on chromosome 4 is
bounded on the left by a T1 sequence whose identifier is 1352. This T1 control
element has the DNA sequence

TTATGAGAAGCTGTCATCGAAGTTAGAGGAAGCTGAA

This double stranded DNA loop is bounded on the right by a T2 control element
whose identifier is 1416. This T2 control element has the DNA sequence

ATTAGATCTATTACATTATGGGTGGTATGTTGGAATAAAAATCAACTATCA
TCTACTAACTAGTATTTACGTTACTAGTATATTATCATATACGGTGTTAGA
AGATGACGCAAATGATGAGAAATAGTCATCTAAATTAGTGGAAGCTGAA
ACGCAAGGATTGATAATGTAATAGGATCAATGAATATTAACATATAAAAC
GATGATAATAATATTTATAGAATTGTGTAGAATTGCAGATTCCCTTTTATG
GATTCCTAAATCCTTGAGGAGAACTTCTAGTATATCTACATACCTAATATT
ATAGCCTTAATCACAATGGAATCCCAACAATTACATCAAAATCCACATTC
TCTACAGTA

This long T1/T2 double stranded DNA loop modulates the expression of the
following genes



YDR170W-A YDR171W YDR172W YDR173C YDR174W YDR175C
YDR176W YDR177W YDR178W YDR179C YDR179W-A YDR180W
YDR181C YDR182W YDR183W YDR184C YDR185C YDR186C
YDR187C YDR188W YDR189W YDR190C YDR191W YDR192C
YDR193W YDR194C YDR195W YDR196C YDR197W YDR198C
YDR199W YDR200C YDR201W YDR202C YDR203W YDR204W
YDR205W YDR206W YDR207C YDR208W YDR209C YDR210W

This long T1/T2 double stranded DNA loop modulates the expression of the 
following C1/C2 short loops

A C1/C2 short loop on chromosome 4 whose identifier is 1356 controls the
expression of the genes of one or more other T1/T2 long loops. This C1/C2 short 
loop is expressed as a RNA single strand that is 3'UTR to the gene YDR170W-A and 
has the DNA sequence 

AATCACACTAATCATTCTGATGATGAACTCCCTGGACACCTCCTTCTCGAT 

TCAGGAGCATCACGAACCCTTATAAGATCTGCTCATCACATACACTCAGC 

ATCATCTAATCCTGACATAAACGTAGTTGATGCTCAAAAAAGAAATATAC 

CAATTAACGCTATTGGTGACCTACAATTTCACTTCCAGGACAACACCAAA 

ACATCAATAAAGGTATTGCACACTCCTAACATAGCCTATGACTTACTCAGT 

TTGAATGAATTGGCTGCAGTAGATATCACAGCATGCTTTACCAAAAACGT 

CTTAGAACG 

The expression of genes in this T1/T2 long loop is controlled by the following C1/C2 
short loops. 

A C1/C2 short loop on chromosome 10 whose identifier is 4213 controls the 
expression of the genes in this T1/T2 long loop. This C1/C2 short loop is expressed 
as a RNA single strand that is 3'UTR to the gene YJR029W and has the DNA 
sequence 

ATCTATTACATTATGGGTGGTATGTTGGAATAAAAATCCACTATCGTCTAT 

CAACTAATAGTTATATTATCAATATATTATCATATACGGTGTTAAGATGAT 

GACATAAGTTATGAGAAGCTGTCATCGAAGTTAGAGGAAGCTGAAACGC 

AAGGATTGATAATGTAATAGGATCAATGAATATAAACATATAAAACGGA 

ATGAGGAATAATCGTAATATTAGTATGTAGAAATATAGATTCCATTTTGA 

GGATTCCTATATCCTCGAGGAGAACTTCTAGTATATTCTGTATACCTAATA 

TTATAGCCTTTATCAACAATGGAATCCCAACAATTATCTCAACAT 

The match between the T1 sequence and the C1/C2 sequence is
TTATGAGAAGCTGTCATCGAAGTTAGAGGAAGCTGAA
The match between the T2 sequence and the C1/C2 sequence is 
ATCTATTACATTATGGGTGGTATGTTGGAATAAAAATC 



Example of a multi-celled connectron - C. elegans

In this example the existence of the T1-T2 (95-138) long loop on chromosome 1 is
controlled by three C1/C2 short loops on chromosome 5 (21719, 21949 and 21655).
The T1-T2 long loop controls the expression of four genes on chromosome 1 in
addition to seven C1/C2 (119, 122, 125, 130, 132, 134 and 136) short loops.



21719 Chromosome 5 
21949 Chromosome 5 
21655 Chromosome 5 



* * *

| Chromosome 1 |
95 138
| 119 122 |
| 125 130 |
| 132 134 |
| 136 |



A double stranded DNA loop of length 41.978 kilo-bases on chromosome 1 is 
bounded on the left by a T1 sequence whose identifier is 95. This T1 control element
has the DNA sequence

CAGCACGTTCTTAACCATGCAAAATCAGTTGAGAACTCTGCGTCTCTTCTC
CCGC

This double stranded DNA loop is bounded on the right by a T2 control element
whose identifier is 138. This T2 control element has the DNA sequence

ACTCTGCGTCTCTTCTCCCGCATTTTTTGTAGATCA

This long T1/T2 double stranded DNA loop modulates the expression of the
following genes

Y73A3A.1 Y73A3AT ZC123.3 ZC123.2 

This long T1/T2 double stranded DNA loop modulates the expression of the 
following C1/C2 short loops

A C1/C2 short loop on chromosome 1 whose identifier is 119 controls the expression
of the genes of one or more other T1/T2 long loops. This C1/C2 short loop is
expressed as a RNA single strand that is 3'UTR to the gene ZC123.3 and has the
DNA sequence

TTGAGAACTCTGCGTCTCAACTCCCGCATTTTTTGTAGATCTACGTAGATC 
AAACCGAAATGGGACACT 

A C1/C2 short loop on chromosome 1 whose identifier is 122 controls the expression
of the genes of one or more other T1/T2 long loops. This C1/C2 short loop is
expressed as a RNA single strand that is 3'UTR to the gene ZC123.3 and has the
DNA sequence

GCACGGGGTTCTGGCCTTCCTCATTGAATTTTTCGCGCTCCATTGACAATC
GCCTGCCGGACAACGCGTGGGAAAGTCGTGTACTCCAC

A C1/C2 short loop on chromosome 1 whose identifier is 125 controls the expression
of the genes of one or more other T1/T2 long loops. This C1/C2 short loop is 
expressed as a RNA single strand that is 3'UTR to the gene ZC123.3 and has the 
DNA sequence 

ACGCGCCGTAAATCTACCCCAGATATGGCCGAGCCAAAATGGCCTAGTTC 
GGCAAACTCTTTCATTTCAATTTATGAGGGAAGCCAGAA 

A C1/C2 short loop on chromosome 1 whose identifier is 130 controls the expression 
of the genes of one or more other T1/T2 long loops. This C1/C2 short loop is
expressed as a RNA single strand that is 3'UTR to the gene ZC123.2 and has the
DNA sequence

CTCCCGCATTTTTTGTAGATCTACGTAGATCAAACCGAAATGAGGCACTTT
CTGAATCCACGAGCTAGGCTTAAGCTTAGGCTTAAGCTTAGGCCTTTTCTC
AGGCTTAGGCTTAGGCTTA

A C1/C2 short loop on chromosome 1 whose identifier is 132 controls the expression
of the genes of one or more other T1/T2 long loops. This C1/C2 short loop is
expressed as a RNA single strand that is 3'UTR to the gene ZC123.2 and has the
DNA sequence

GCTTATGCTTGGGCTTAGGCTTAGGCGTAGGCTTAGGCTTAGGCTTAGGCT
TATGCTTAGACTTAGTCTCACTATCAGTCTTAGGCTTAGGCTTAGACTTAG
GCTTAAGCTTAGGCTTAAGCTTAGACTTAGGCTTAGGCTTAGGCTTAGGCT
TAGGCTTAGGTTTGGGCTTAGGCTTAGGCTTAACCTC

A C1/C2 short loop on chromosome 1 whose identifier is 134 controls the expression 
of the genes of one or more other T1/T2 long loops. This C1/C2 short loop is 
expressed as a RNA single strand that is 3'UTR to the gene ZC123.2 and has the
DNA sequence

TCTGCGTCTTTTCTCCCGCATTTTTTGTAGATCTACGTAGATCAAACCGAA

ATGAGGCACTTTCTGAATCCACGAGCTAGGCTTAAGCTTAGGCTTAAGCTT 

AGGCCTTTTCTCAGGCTTAGGCTTAGGCTTA 

A C1/C2 short loop on chromosome 1 whose identifier is 136 controls the expression 
of the genes of one or more other T1/T2 long loops. This C1/C2 short loop is 
expressed as a RNA single strand that is 3'UTR to the gene ZC123.2 and has the 
DNA sequence 

GCTTATGCTTGGGCTTAGGCTTAGGCGTAGGCTTAGGCTTAGGCTTAGGCT 
TATGCTTAGACTTAGTCTCACTATCAGTCTTAGGCTTAGGCTTAGACTTAG 
GCTTAAGCTTAGGCTTAAGCTTAGACTTAGGCTTAGGCTTAGGCTTAGGCT 
TAGGCTTAGGTTTGGGCTTAGGCTTAGGCTTAACCTC 

The expression of genes in this T1/T2 long loop is controlled by the following C1/C2 
short loops. 

A C1/C2 short loop on chromosome 5 whose identifier is 21719 controls the 
expression of the genes in this T1/T2 long loop. This C1/C2 short loop is expressed
as a RNA single strand that is 3'UTR to the gene C39F7.5 and has the DNA sequence 

ACGTTCTTAACCATGCAAAATCAGTTGAGAACTCTGCGTCTCTTCTCCCGC 
ATTTTTTGTAGATC 

The match between the T1 sequence and the C1/C2 sequence is
ACGTTCTTAACCATGCAAAATCAGTTGAGAACTCTGCGTCTCTTCTCCCGC 
The match between the T2 sequence and the C1/C2 sequence is 
ACTCTGCGTCTCTTCTCCCGCATTTTTTGTAGATC

A C1/C2 short loop on chromosome 5 whose identifier is 21949 controls the
expression of the genes in this T1/T2 long loop. This C1/C2 short loop is expressed 
as a RNA single strand that is 3'UTR to the gene F16B4.4 and has the DNA sequence 

ACCATGCAAAATCAGTTGAGAACTCTGCGTCTCTTCTCCCGCATTTTTTGT 
AGATCTACGTAGATCAAGCCGAAATGAGACACTCTGACACCACG 

The match between the T1 sequence and the C1/C2 sequence is

ACCATGCAAAATCAGTTGAGAACTCTGCGTCTCTTCTCCCGC 

The match between the T2 sequence and the C1/C2 sequence is 

ACTCTGCGTCTCTTCTCCCGCATTTTTTGTAGATC 

A C1/C2 short loop on chromosome 5 whose identifier is 21655 controls the 
expression of the genes in this T1/T2 long loop. This C1/C2 short loop is expressed 
as a RNA single strand that is 3'UTR to the gene C39F7.3 and has the DNA sequence

AACCATGCAAAATCAGTTGAGAACTCTGCGTCTCTTCTCCCGCATTTTTTG 
TAGATCTACG 

The match between the T1 sequence and the C1/C2 sequence is
AACCATGCAAAATCAGTTGAGAACTCTGCGTCTCTTCTCCCGC 
The match between the T2 sequence and the C1/C2 sequence is 
ACTCTGCGTCTCTTCTCCCGCATTTTTTGTAGATC 
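
To see how the two matches sit inside a single C1/C2 short loop, the following Python sketch locates each listed match within short loop 21719 and reports how many bases the two spans share. It is a sketch under stated assumptions: the helper name `match_span` is hypothetical, and the sequences are the ones listed above for short loop 21719 and its T1 and T2 matches.

```python
def match_span(short_loop: str, match: str) -> tuple[int, int]:
    """Return the half-open (start, end) span of match inside short_loop."""
    start = short_loop.find(match)
    if start < 0:
        raise ValueError("match not found in short loop")
    return start, start + len(match)


# C1/C2 short loop 21719 (two listing lines joined) and its listed matches.
C1C2_21719 = (
    "ACGTTCTTAACCATGCAAAATCAGTTGAGAACTCTGCGTCTCTTCTCCCGC"
    "ATTTTTTGTAGATC"
)
T1_MATCH = "ACGTTCTTAACCATGCAAAATCAGTTGAGAACTCTGCGTCTCTTCTCCCGC"
T2_MATCH = "ACTCTGCGTCTCTTCTCCCGCATTTTTTGTAGATC"

c1_span = match_span(C1C2_21719, T1_MATCH)
c2_span = match_span(C1C2_21719, T2_MATCH)

# Number of bases shared by the two spans (0 if they do not overlap).
overlap = max(0, min(c1_span[1], c2_span[1]) - max(c1_span[0], c2_span[0]))
print(c1_span, c2_span, overlap)
```

For this short loop the T1 match covers the first 51 bases and the T2 match covers the last 35 bases, so the two matched regions share 21 bases in the middle of the 65-base sequence.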

2. Many Connectrons control the expression of one set of genes in prokaryotes,
archea, single-celled eukaryotes and multi-celled eukaryotes.

Many different C1/C2 short loops can control the existence of one T1-T2 long loop.
The C1/C2 short loops can be on the same chromosome or on different chromosomes
from the T1-T2 long loop. This relationship is described as "many-to-one". This
relationship exists in prokaryotes, archea, single-celled eukaryotes and multi-celled
eukaryotes.
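
The many-to-one relationship can be pictured as a mapping from one T1-T2 long loop to the list of C1/C2 short loops that control it. The Python sketch below is only an illustration of that data structure: the names `CONTROL_PAIRS` and `controllers_by_long_loop` are hypothetical, and the identifiers are taken from the E. coli and M. jannaschii examples in this section.

```python
from collections import defaultdict

# Each entry pairs a long loop (organism, T1 identifier, T2 identifier)
# with the identifier of one C1/C2 short loop that controls it.
CONTROL_PAIRS = [
    (("E. coli", 3197, 3308), 3307),
    (("E. coli", 3197, 3308), 3432),
    (("E. coli", 3197, 3308), 2218),
    (("M. jannaschii", 1630, 1643), 1629),
    (("M. jannaschii", 1630, 1643), 1642),
    (("M. jannaschii", 1630, 1643), 124),
    (("M. jannaschii", 1630, 1643), 1533),
]

# Group the short loops under the long loop they control (many-to-one).
controllers_by_long_loop = defaultdict(list)
for long_loop, short_loop_id in CONTROL_PAIRS:
    controllers_by_long_loop[long_loop].append(short_loop_id)

for long_loop, short_loops in controllers_by_long_loop.items():
    organism, t1_id, t2_id = long_loop
    print(f"{organism} T1-T2 ({t1_id}-{t2_id}): {len(short_loops)} short loops {short_loops}")
```

Keying the mapping on the long loop keeps the many-to-one direction explicit: one key, several controlling short loops.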

Example of a many-to-one connectron in prokaryotes - E. coli

In this example the existence of the T1-T2 (3197-3308) long loop is controlled by 
three C1/C2 short loops (3307, 3432 and 2218). 

3307 Chromosome 1 
3432 Chromosome 1 
2218 Chromosome 1 

* * * 

| Chromosome 1 |

3197 3308 



A double stranded DNA loop of length 93.542 kilo-bases on chromosome 1 is 
bounded on the left by a T1 sequence whose identifier is 3197. This T1 control
element has the DNA sequence 

AAAAAATGCGCGGTCAGAAAATTATTTTAAATTTCCTCTTGTCAGGCCGG 
AATAACTCCCTATAATGCGCCACCACTGACACGGAACAACGGCAAACACG 
CCGCCGGGTCAGCGGGGTTCTCCTGAGAACTCCGGCAGAGAAAGCAAAA 
ATAAATGCTTGACTCTGTAGCGGGAA 

This double stranded DNA loop is bounded on the right by a T2 control element 
whose identifier is 3308. This T2 control element has the DNA sequence

TAAATTTCCTCTTGTCAGGCCGGAATAACTCCCTATAATGCGCCACCACTG
ACACGGAACAACGGCAAACACGCCGCCGGGTCAGCGGGGTTCTCCTGAG
AACTCCGGCAGAGAAAGCAAAAATAAATGCTTGACTCTGTAGCGGGAAG
GCGTATTATGCACACCCCGCGCCGCT



This long T1/T2 double stranded DNA loop modulates the expression of the 
following genes

The expression of genes in this T1/T2 long loop is controlled by the following C1/C2
short loops.

A C1/C2 short loop on chromosome 1 whose identifier is 3307 controls the
expression of the genes in this T1/T2 long loop. This C1/C2 short loop is expressed
as a RNA single strand that is 3'UTR to the gene hemG and has the DNA sequence

AAAAAATGCGCGGTCAGAAAATTATTTTAAATTTCCTCTTGTCAGGCCGG 
AATAACTCCCTATAATGCGCCACCACTGACACGGAACAACGGCAAACACG 
CCGCCGGGTCAGCGGGGTTCTCCTGAGAACTCCGGCAGAGAAAGCAAAA
ATAAATGCTTGACTCTGTAGCGGGAAGGCGTATTATG...GGAGTCTGCAAC
TCGACTCCATGAAGTCGGAATCGCTAGTAATCGTGGATCAGAATGCCACG
GTGAATACGTTCCCGGGCCTTGTACACACCGCCCGTCACACCATGGGAGT
GGGTTGCAAAAGAAGTAGGTAGCTTAACCTTCGGGAGGGCGCTTACCACT 
TTGTGATTCATGACTGGGGTGAAGTCGTAACAAGGTAACCGTAGGGGAAC 
CTGCGGTTGGATCACCTCCTTACCTTAAAGAAGCGTTCTTTG 

The match between the T1 sequence and the C1/C2 sequence is

AAAAAATGCGCGGTCAGAAAATTATTTTAAATTTCCTCTTGTCAGGCCGG 
AATAACTCCCTATAATGCGCCACCACTGACACGGAACAACGGCAAACACG 
CCGCCGGGTCAGCGGGGTTCTCCTGAGAACTCCGGCAGAGAAAGCAAAA 
ATAAATGCTTGACTCTGTAGCGGGAA 

The match between the T2 sequence and the C1/C2 sequence is 

TAAATTTCCTCTTGTCAGGCCGGAATAACTCCCTATAATGCGCCACCACTG 
ACACGGAACAACGGCAAACACGCCGCCGGGTCAGCGGGGTTCTCCTGAG 
AACTCCGGCAGAGAAAGCAAAAATAAATGCTTGACTCTGTAGCGGGAAG 
GCGTATTATGCACACCCCGCGCCGCT 

A C1/C2 short loop on chromosome 1 whose identifier is 3432 controls the 
expression of the genes in this T1/T2 long loop. This C1/C2 short loop is expressed 
as a RNA single strand that is 3'UTR to the gene btuB and has the DNA sequence 

TGCGCGGTCAGAAAATTATTTTAAATTTCCTCTTGTCAGGCCGGAATAACT 

CCCTATAATGCGCCACCACTGACACGGAACAACGGCAAACACGCCGCCGG 

GTCAGCGGGGTTCTCCTGAGAACTCCGGCAGAGAAAGCAAAAATAAATG 

CTTGACTCTGTAGCGGGAAGGCGTATTATGCACACC...ACACCATGGGAGT 

GGGTTGCAAAAGAAGTAGGTAGCTTAACCTTCGGGAGGGCGCTTACCACT 

TTGTGATTCATGACTGGGGTGAAGTCGTAACAAGGTAACCGTAGGGGAAC 

CTGCGGTTGGATCACCTCCTTACCTTAAAGAAGCGT 

The match between the T1 sequence and the C1/C2 sequence is

TGCGCGGTCAGAAAATTATTTTAAATTTCCTCTTGTCAGGCCGGAATAACT
CCCTATAATGCGCCACCACTGACACGGAACAACGGCAAACACGCCGCCGG
GTCAGCGGGGTTCTCCTGAGAACTCCGGCAGAGAAAGCAAAAATAAATG
CTTGACTCTGTAGCGGGAA

The match between the T2 sequence and the C1/C2 sequence is 

TAAATTTCCTCTTGTCAGGCCGGAATAACTCCCTATAATGCGCCACCACTG 
ACACGGAACAACGGCAAACACGCCGCCGGGTCAGCGGGGTTCTCCTGAG
AACTCCGGCAGAGAAAGCAAAAATAAATGCTTGACTCTGTAGCGGGAAG 
GCGTATTATGCACACCCCGCGCCGCT 

A C1/C2 short loop on chromosome 1 whose identifier is 2218 controls the 
expression of the genes in this T1/T2 long loop. This C1/C2 short loop is expressed
as a RNA single strand that is 3'UTR to the gene clpB and has the DNA sequence

CTTGTCAGGCCGGAATAACTCCCTATAATGCGCCACCACTGACACGGAAC 
AACGGCAAACACGCCGCCGGGC 

The match between the T1 sequence and the C1/C2 sequence is

CTTGTCAGGCCGGAATAACTCCCTATAATGCGCCACCACTGACACGGAAC 
AACGGCAAACACGCCGCCGGGC 

The match between the T2 sequence and the C1/C2 sequence is 

CTTGTCAGGCCGGAATAACTCCCTATAATGCGCCACCACTGACACGGAAC 
AACGGCAAACACGCCGCCGGGC 

Example of a many-to-one connectron in archea - M. jannaschii

In this example the existence of the T1-T2 (1630-1643) long loop is controlled by 
four C1/C2 short loops (1629, 1642, 124 and 1533). 

1629 Chromosome 1 
1642 Chromosome 1 
124 Chromosome 1 
1533 Chromosome 1 



| Chromosome 1 |

1630 1643 



A double stranded DNA loop of length 4.998 kilo-bases on chromosome 1 is bounded 
on the left by a T1 sequence whose identifier is 1630. This T1 control element has
the DNA sequence 

TTATTAATTAGTTCAAAGGATTTTTATTTAATTTCTAAGGGTTTGCTGGTTT
GATTATTTAGAATATTTGAGTTTATTGAATTATTCAGATTTTTAAAAATTA
AGATTAATTAGGAAAGGAAATAAGATTTCTCTAACAGACAAGTTAAATTT 
TTGGATTTAAAAAGATAAAAAT 

This double stranded DNA loop is bounded on the right by a T2 control element 
whose identifier is 1643. This T2 control element has the DNA sequence 

TTAATTTCTAAGGGTTAGCTGGTTTGATTATTTAGAATATTTGAGTTTATTG 
AATTATTCAGATTTTTAAAAATTAGGATTAATTAGGCAAGTAAATAAAAT 
TTCTCTAACAAATAAGTTAAATTTTTGGATTTAAAAAGATAAAAATACTCT 
GTTTTATTATGGAAAGAAAGAT 

This long T1/T2 double stranded DNA loop modulates the expression of the
following genes

MJ1597 MJ1598 MJ1599 MJ1600 MJ1601 MJ1602

The expression of genes in this T1/T2 long loop is controlled by the following C1/C2
short loops. 

A C1/C2 short loop on chromosome 1 whose identifier is 1629 controls the 
expression of the genes in this T1/T2 long loop. This C1/C2 short loop is expressed 
as a RNA single strand that is 3'UTR to the gene MJ1597 and has the DNA sequence 

ATATGTTTGAAATTTGAAAATAAGAGTATTTAGAAGTTATTAATTAGTTCA 
AAGGATTTTTATTTAATTTCTAAGGGTTTGCTGGTTTGATTATTTAGAATAT 
TTGAGTTTATTGAATTATTCAGATTTTTAAAAATTA 

The match between the Tl sequence and the C1/C2 sequence is 

TTATTAATTAGTTCAAAGGATTTTTATTTAATTTCTAAGGGTTTGCTGGTTT 
GATTATTTAGAATATTTGAGTTTATTGAATTATTCAGATTTTTAAAAATTA 

The match between the T2 sequence and the C1/C2 sequence is 

GCTGGTTTGATTATTTAGAATATTTGAGTTTATTGAATTATTCAGATTTTTA 
AAAATTA 

A C1/C2 short loop on chromosome 1 whose identifier is 1642 controls the 
expression of the genes in this T1/T2 long loop. This C1/C2 short loop is expressed 
as a RNA single strand that is 3'UTR to the gene MJ1602 and has the DNA sequence 

ATTTAATTTCTAAGGGTTAGCTGGTTTGATTATTTAGAATATTTGAGTTTAT 
TGAATTATTCAGATTTTTAAAAATTAGGATTAATTAGGCAAGTAAATAAA 

ATTTCTCTAACAAATAAGTTAAATTTTTGGATTTAAAAAGATAAAAATACT 
CTGTTTTATTATGGAAAGAAAGAT

The match between the T1 sequence and the C1/C2 sequence is

GCTGGTTTGATTATTTAGAATATTTGAGTTTATTGAATTATTCAGATTTTTA
AAAATTA 

The match between the T2 sequence and the C1/C2 sequence is 

TTAATTTCTAAGGGTTAGCTGGTTTGATTATTTAGAATATTTGAGTTTATTG 
AATTATTCAGATTTTTAAAAATTAGGATTAATTAGGCAAGTAAATAAAAT 

TTCTCTAACAAATAAGTTAAATTTTTGGATTTAAAAAGATAAAAATACTCT 
GTTTTATTATGGAAAGAAAGAT 

A C1/C2 short loop on chromosome 1 whose identifier is 124 controls the expression 
of the genes in this T1/T2 long loop. This C1/C2 short loop is expressed as a RNA 
single strand that is 3'UTR to the gene MJ0112 and has the DNA sequence 

ATTTAATTTCTAAGGGTTTGCTGGTTTGATTATTTAGAATATTTGAGTTTAT 
TGAATTATTCAGATTTTTAAAAT 

The match between the T1 sequence and the C1/C2 sequence is

ATTTAATTTCTAAGGGTTTGCTGGTTTGATTATTTAGAATATTTGAGTTTAT 
TGAATTATTCAGATTTTTAAAAT 

The match between the T2 sequence and the C1/C2 sequence is 

GCTGGTTTGATTATTTAGAATATTTGAGTTTATTGAATTATTCAGATTTTTA 
AAAAT

A C1/C2 short loop on chromosome 1 whose identifier is 1533 controls the
expression of the genes in this T1/T2 long loop. This C1/C2 short loop is expressed 
as a RNA single strand that is 3'UTR to the gene MJ1486 and has the DNA sequence 

TTTTTATTTAATTTCTAAGGGTTTGCTGGTTTGATTATTTAGAATATTTGAG
TTTATT 

The match between the T1 sequence and the C1/C2 sequence is

TTTTTATTTAATTTCTAAGGGTTTGCTGGTTTGATTATTTAGAATATTTGAG 
TTTATT 

The match between the T2 sequence and the C1/C2 sequence is 
GCTGGTTTGATTATTTAGAATATTTGAGTTTATT 



Example of a many-to-one connectron in single-cell eukaryotes - S. cerevisiae

In this example the existence of the T1-T2 (5515-5533) long loop on chromosome 12
is controlled by seventeen C1/C2 short loops (5516, 5532, 1939, 2323, 1942, 3286, 
3649, 4764, 4751, 5536, 6102, 8023, 7356, 3293, 3291, 3289 and 146). 

5516 Chromosome 12 
5532 Chromosome 12 
1939 Chromosome 4 
2323 Chromosome 5 
1942 Chromosome 5 
3286 Chromosome 7 
3649 Chromosome 8 
4764 Chromosome 12 
4751 Chromosome 12 
5536 Chromosome 13 
6102 Chromosome 14
8023 Chromosome 16
7356 Chromosome 16 
3293 Chromosome 8 
3291 Chromosome 8 
3289 Chromosome 8 
146 Chromosome 2 



| Chromosome 12 |

5515 5533



A double stranded DNA loop of length 6.466 kilo-bases on chromosome 12 is 
bounded on the left by a T1 sequence whose identifier is 5515. This T1 control
element has the DNA sequence 

AGGAAATTGTTGTTACGAAAGTCAGTGATTATGTATTGTGTAGTATAGTAT 

ATTGTAAGAAATTTTTTTTTCTAGGGAATATGCGTTTTGATGTAGTAGTAT 

TTCACTGTTTTGATTTAGTGTTTGTTGCACGGCAGTAGCGAGAGACAAGTG 

GGAAAGAGTAGGATAAAAAGACAATCTATAAAAAGTAAACATAAAATAA 
AGGTAGTAAGTAGCTTTTGGTTG 

This double stranded DNA loop is bounded on the right by a T2 control element 
whose identifier is 5533. This T2 control element has the DNA sequence 

ATTATGTATTGTGTAGTATAGTATATTGTAAGAAATTTTTTTTTCTAGGGA 

ATATGCGTTTTGATGTAGTAGTATTTCACTGTTTTGATTTAGTGTTTGTTGC 

ACGGCAGTAGCGAGAGACAAGTGGGAAAGAGTAGGATAAAAAGACAATC 

TATAAAAAGTAAACATAAAATAAAGGTAGTAAGTAGCTTTTGGTTGAACA 
TCCGGGTAAGAGACAACAGGGCT 

This long T1/T2 double stranded DNA loop modulates the expression of the 
following genes 

YLR467W

This long T1/T2 double stranded DNA loop modulates the expression of the
following C1/C2 short loops 

A C1/C2 short loop on chromosome 12 whose identifier is 5516 controls the 
expression of the genes of one or more other T1/T2 long loops. This C1/C2 short 
loop is expressed as a RNA single strand that is 3'UTR to the gene YLR464W and has 
the DNA sequence 

AGGAAATTGTTGTTACGAAAGTCAGTGATTATGTATTGTGTAGTATAGTAT 

ATTGTAAGAAATTTTTTTTTCTAGGGAATATGCGTTTTGATGTAGTAGTAT 

TTCACTGTTTTGATTTAGTGTTTGTTGCACGGCAGTAGCGAGAGACAAGTG 

GGAAAGAGTAGGATAAAAAGACAATCTATAAAAAGTAAACATAAAATAA 

AGGTAGTAAGTAGCTTTTGGTTGAACATCCGGGTAAGAGACAACAGGGCT 

A C1/C2 short loop on chromosome 12 whose identifier is 5532 controls the 
expression of the genes of one or more other T1/T2 long loops. This C1/C2 short 
loop is expressed as a RNA single strand that is 3'UTR to the gene YLR467W and has
the DNA sequence 

AGGAAATTGTTGTTACGAAAGTCAGTGATTATGTATTGTGTAGTATAGTAT 

ATTGTAAGAAATTTTTTTTTCTAGGGAATATGCGTTTTGATGTAGTAGTAT 

TTCACTGTTTTGATTTAGTGTTTGTTGCACGGCAGTAGCGAGAGACAAGTG 

GGAAAGAGTAGGATAAAAAGACAATCTATAAAAAGTAAACATAAAATAA 

AGGTAGTAAGTAGCTTTTGGTTGAACATCCGGGTAAGAGACAACAGGGCT 

The expression of genes in this T1/T2 long loop is controlled by the following C1/C2 
short loops. 

A C1/C2 short loop on chromosome 4 whose identifier is 1939 controls the 
expression of the genes in this T1/T2 long loop. This C1/C2 short loop is expressed
as a RNA single strand that is 3'UTR to the gene YDR545W and has the DNA
sequence 

AGGAAATTGTTGTTACGAAAGTCAGTGATTATGTATTGTGTAGTATAGTAT 

ATTGTAAGAAATTTTTTTTTCTAGGGAATATGCGTTTTGATGTAGTAGTAT 

TTCACTGTTTTGATTTAGTGTTTGTTGCACGGCAGTAGCGAGAGACAAGTG 

GGAAAGAGTAGGATAAAAAGACAATCTATAAAAAGTAAACATAAAATAA 
AGGTAGTAAGTAGCTTTTGG 

The match between the T1 sequence and the C1/C2 sequence is

AGGAAATTGTTGTTACGAAAGTCAGTGATTATGTATTGTGTAGTATAGTAT 

ATTGTAAGAAATTTTTTTTTCTAGGGAATATGCGTTTTGATGTAGTAGTAT 

TTCACTGTTTTGATTTAGTGTTTGTTGCACGGCAGTAGCGAGAGACAAGTG 

GGAAAGAGTAGGATAAAAAGACAATCTATAAAAAGTAAACATAAAATAA 
AGGTAGTAAGTAGCTTTTGG 

The match between the T2 sequence and the C1/C2 sequence is 

ATTATGTATTGTGTAGTATAGTATATTGTAAGAAATTTTTTTTTCTAGGGA 
ATATGCGTTTTGATGTAGTAGTATTTCACTGTTTTGATTTAGTGTTTGTTGC 
ACGGCAGTAGCGAGAGACAAGTGGGAAAGAGTAGGATAAAAAGACAATC 
TATAAAAAGTAAACATAAAATAAAGGTAGTAAGTAGCTTTTGG 

A C1/C2 short loop on chromosome 5 whose identifier is 2323 controls the 
expression of the genes in this T1/T2 long loop. This C1/C2 short loop is expressed 
as a RNA single strand that is 3'UTR to the gene YER189W and has the DNA 
sequence 

AGGAAATTGTTGTTACGAAAGTCAGTGATTATGTATTGTGTAGTATAGTAT 

ATTGTAAGAAATTTTTTTTTCTAGGGAATATGCGTTTTGATGTAGTAGTAT 

TTCACTGTTTTGATTTAGTGTTTGTTGCACGGCAGTAGCGAGAGACAAGTG

GGAAAGAGTAGGATAAAAAGACAATCTATAAAAAGTAAACATAAAATAA
AGGTAGTAAGTAGCTTTTGGTTGAACATCCGGGTAAGAGACAACAGGGCT 

The match between the T1 sequence and the C1/C2 sequence is

AGGAAATTGTTGTTACGAAAGTCAGTGATTATGTATTGTGTAGTATAGTAT 

ATTGTAAGAAATTTTTTTTTCTAGGGAATATGCGTTTTGATGTAGTAGTAT 

TTCACTGTTTTGATTTAGTGTTTGTTGCACGGCAGTAGCGAGAGACAAGTG 

GGAAAGAGTAGGATAAAAAGACAATCTATAAAAAGTAAACATAAAATAA 
AGGTAGTAAGTAGCTTTTGGTTG 

The match between the T2 sequence and the C1/C2 sequence is 

ATTATGTATTGTGTAGTATAGTATATTGTAAGAAATTTTTTTTTCTAGGGA 

ATATGCGTTTTGATGTAGTAGTATTTCACTGTTTTGATTTAGTGTTTGTTGC 

ACGGCAGTAGCGAGAGACAAGTGGGAAAGAGTAGGATAAAAAGACAATC 

TATAAAAAGTAAACATAAAATAAAGGTAGTAAGTAGCTTTTGGTTGAACA 
TCCGGGTAAGAGACAACAGGGCT

A C1/C2 short loop on chromosome 5 whose identifier is 1942 controls the 
expression of the genes in this T1/T2 long loop. This C1/C2 short loop is expressed 
as a RNA single strand that is 3'UTR to the gene YEL077C and has the DNA 
sequence 

AGGAAATTGTTGTTACGAAAGTCAGTGATTATGTATTGTGTAGTATAGTAT 

ATTGTAAGAAATTTTTTTTTCTAGGGAATATGCGTTTTGATGTAGTAGTAT 

TTCACTGTTTTGATTTAGTGTTTGTTGCACGGCAGTAGCGAGAGACAAGTG 

GGAAAGAGTAGGATAAAAAGACAATCTATAAAAAGTAAACATAAAATAA 

AGGTAGTAAGTAGCTTTTGGTTGAACATCCGGGTAAGAGACAACAGGGCT 

The match between the T1 sequence and the C1/C2 sequence is

AGGAAATTGTTGTTACGAAAGTCAGTGATTATGTATTGTGTAGTATAGTAT
ATTGTAAGAAATTTTTTTTTCTAGGGAATATGCGTTTTGATGTAGTAGTAT
TTCACTGTTTTGATTTAGTGTTTGTTGCACGGCAGTAGCGAGAGACAAGTG
GGAAAGAGTAGGATAAAAAGACAATCTATAAAAAGTAAACATAAAATAA
AGGTAGTAAGTAGCTTTTGGTTG

The match between the T2 sequence and the C1/C2 sequence is 

ATTATGTATTGTGTAGTATAGTATATTGTAAGAAATTTTTTTTTCTAGGGA

ATATGCGTTTTGATGTAGTAGTATTTCACTGTTTTGATTTAGTGTTTGTTGC 

ACGGCAGTAGCGAGAGACAAGTGGGAAAGAGTAGGATAAAAAGACAATC 

TATAAAAAGTAAACATAAAATAAAGGTAGTAAGTAGCTTTTGGTTGAACA 
TCCGGGTAAGAGACAACAGGGCT

A C1/C2 short loop on chromosome 7 whose identifier is 3286 controls the 
expression of the genes in this T1/T2 long loop. This C1/C2 short loop is expressed 
as a RNA single strand that is 3'UTR to the gene YGR296W and has the DNA 
sequence

AGGAAATTGTTGTTACGAAAGTCAGTGATTATGTATTGTGTAGTATAGTAT 
ATTGTAAGAAATTTTTTTTTCTAGGGAATATGCGTTTTGATGTAGTAGTAT 
TTCACTGTTTTGATTTAGTGTTTGTTGCACGGCAGTAGCGAGAGACAAGTG 
GGAAAGAGTAGGATAAAAAGACAATCTATAAAAAGTAAACATAAAATAA
AGGTAGTAAGTAGCTTTTGGTTGAACATCCGGGTAAGAGACAACAGGGCT 

The match between the T1 sequence and the C1/C2 sequence is

AGGAAATTGTTGTTACGAAAGTCAGTGATTATGTATTGTGTAGTATAGTAT
ATTGTAAGAAATTTTTTTTTCTAGGGAATATGCGTTTTGATGTAGTAGTAT
TTCACTGTTTTGATTTAGTGTTTGTTGCACGGCAGTAGCGAGAGACAAGTG
GGAAAGAGTAGGATAAAAAGACAATCTATAAAAAGTAAACATAAAATAA
AGGTAGTAAGTAGCTTTTGGTTG 

The match between the T2 sequence and the C1/C2 sequence is 

ATTATGTATTGTGTAGTATAGTATATTGTAAGAAATTTTTTTTTCTAGGGA 

ATATGCGTTTTGATGTAGTAGTATTTCACTGTTTTGATTTAGTGTTTGTTGC 

ACGGCAGTAGCGAGAGACAAGTGGGAAAGAGTAGGATAAAAAGACAATC 

TATAAAAAGTAAACATAAAATAAAGGTAGTAAGTAGCTTTTGGTTGAACA 
TCCGGGTAAGAGACAACAGGGCT 

A C1/C2 short loop on chromosome 8 whose identifier is 3649 controls the 
expression of the genes in this T1/T2 long loop. This C1/C2 short loop is expressed 
as a RNA single strand that is 3'UTR to the gene YHR219W and has the DNA 
sequence 

AGGAAATTGTTGTTACGAAAGTCAGTGATTATGTATTGTGTAGTATAGTAT 

ATTGTAAGAAATTTTTTTTTCTAGGGAATATGCGTTTTGATGTAGTAGTAT 

TTCACTGTTTTGATTTAGTGTTTGTTGCACGGCAGTAGCGAGAGACAAGTG 

GGAAAGAGTAGGATAAAAAGACAATCTATAAAAAGTAAACATAAAATAA 

AGGTAGTAAGTAGCTTTTGGTTGAACATCCGGGTAAGAGACAACAGGGCT 

The match between the T1 sequence and the C1/C2 sequence is

AGGAAATTGTTGTTACGAAAGTCAGTGATTATGTATTGTGTAGTATAGTAT 

ATTGTAAGAAATTTTTTTTTCTAGGGAATATGCGTTTTGATGTAGTAGTAT 

TTCACTGTTTTGATTTAGTGTTTGTTGCACGGCAGTAGCGAGAGACAAGTG 

GGAAAGAGTAGGATAAAAAGACAATCTATAAAAAGTAAACATAAAATAA 
AGGTAGTAAGTAGCTTTTGGTTG 

The match between the T2 sequence and the C1/C2 sequence is

ATTATGTATTGTGTAGTATAGTATATTGTAAGAAATTTTTTTTTCTAGGGA

ATATGCGTTTTGATGTAGTAGTATTTCACTGTTTTGATTTAGTGTTTGTTGC 

ACGGCAGTAGCGAGAGACAAGTGGGAAAGAGTAGGATAAAAAGACAATC 

TATAAAAAGTAAACATAAAATAAAGGTAGTAAGTAGCTTTTGGTTGAACA 
TCCGGGTAAGAGACAACAGGGCT 

A C1/C2 short loop on chromosome 12 whose identifier is 4764 controls the 
expression of the genes in this T1/T2 long loop. This C1/C2 short loop is expressed 
as a RNA single strand that is 3'UTR to the gene YLL066C and has the DNA 
sequence 

AGGAAATTGTTGTTACGAAAGTCAGTGATTATGTATTGTGTAGTATAGTAT 

ATTGTAAGAAATTTTTTTTTCTAGGGAATATGCGTTTTGATGTAGTAGTAT 

TTCACTGTTTTGATTTAGTGTTTGTTGCACGGCAGTAGCGAGAGACAAGTG 

GGAAAGAGTAGGATAAAAAGACAATCTATAAAAAGTAAACATAAAATAA 

AGGTAGTAAGTAGCTTTTGGTTGAACATCCGGGTAAGAGACAACAGGGCT 

The match between the T1 sequence and the C1/C2 sequence is

AGGAAATTGTTGTTACGAAAGTCAGTGATTATGTATTGTGTAGTATAGTAT 

ATTGTAAGAAATTTTTTTTTCTAGGGAATATGCGTTTTGATGTAGTAGTAT 

TTCACTGTTTTGATTTAGTGTTTGTTGCACGGCAGTAGCGAGAGACAAGTG 

GGAAAGAGTAGGATAAAAAGACAATCTATAAAAAGTAAACATAAAATAA 
AGGTAGTAAGTAGCTTTTGGTTG 

The match between the T2 sequence and the C1/C2 sequence is 

ATTATGTATTGTGTAGTATAGTATATTGTAAGAAATTTTTTTTTCTAGGGA 

ATATGCGTTTTGATGTAGTAGTATTTCACTGTTTTGATTTAGTGTTTGTTGC 

ACGGCAGTAGCGAGAGACAAGTGGGAAAGAGTAGGATAAAAAGACAATC 

TATAAAAAGTAAACATAAAATAAAGGTAGTAAGTAGCTTTTGGTTGAACA 
TCCGGGTAAGAGACAACAGGGCT 






A C1/C2 short loop on chromosome 12 whose identifier is 4751 controls the 
expression of the genes in this T1/T2 long loop. This C1/C2 short loop is expressed 
as a RNA single strand that is 3'UTR to the gene YLL067C and has the DNA 
sequence 

AGGAAATTGTTGTTACGAAAGTCAGTGATTATGTATTGTGTAGTATAGTAT 

ATTGTAAGAAATTTTTTTTTCTAGGGAATATGCGTTTTGATGTAGTAGTAT 

TTCACTGTTTTGATTTAGTGTTTGTTGCACGGCAGTAGCGAGAGACAAGTG 

GGAAAGAGTAGGATAAAAAGACAATCTATAAAAAGTAAACATAAAATAA 

AGGTAGTAAGTAGCTTTTGGTTGAACATCCGGGTAAGAGACAACAGGGCT 

The match between the T1 sequence and the C1/C2 sequence is

AGGAAATTGTTGTTACGAAAGTCAGTGATTATGTATTGTGTAGTATAGTAT 
ATTGTAAGAAATTTTTTTTTCTAGGGAATATGCGTTTTGATGTAGTAGTAT 
TTCACTGTTTTGATTTAGTGTTTGTTGCACGGCAGTAGCGAGAGACAAGTG 
GGAAAGAGTAGGATAAAAAGACAATCTATAAAAAGTAAACATAAAATAA 

AGGTAGTAAGTAGCTTTTGGTTG 

The match between the T2 sequence and the C1/C2 sequence is 

ATTATGTATTGTGTAGTATAGTATATTGTAAGAAATTTTTTTTTCTAGGGA 
ATATGCGTTTTGATGTAGTAGTATTTCACTGTTTTGATTTAGTGTTTGTTGC
ACGGCAGTAGCGAGAGACAAGTGGGAAAGAGTAGGATAAAAAGACAATC 
TATAAAAAGTAAACATAAAATAAAGGTAGTAAGTAGCTTTTGGTTGAACA 

TCCGGGTAAGAGACAACAGGGCT 

A C1/C2 short loop on chromosome 13 whose identifier is 5536 controls the 
expression of the genes in this T1/T2 long loop. This C1/C2 short loop is expressed 
as a RNA single strand that is 3'UTR to the gene YML133C and has the DNA 
sequence

AGGAAATTGTTGTTACGAAAGTCAGTGATTATGTATTGTGTAGTATAGTAT
ATTGTAAGAAATTTTTTTTTCTAGGGAATATGCGTTTTGATGTAGTAGTAT
TTCACTGTTTTGATTTAGTGTTTGTTGCACGGCAGTAGCGAGAGACAAGTG
GGAAAGAGTAGGATAAAAAGACAATCTATAAAAAGTAAACATAAAATAA
AGGTAGTAAGTAGCTTTTGGTTGAACATCCGGGTAAGAGACAACAGGGCT

The match between the T1 sequence and the C1/C2 sequence is

AGGAAATTGTTGTTACGAAAGTCAGTGATTATGTATTGTGTAGTATAGTAT
ATTGTAAGAAATTTTTTTTTCTAGGGAATATGCGTTTTGATGTAGTAGTAT 
TTCACTGTTTTGATTTAGTGTTTGTTGCACGGCAGTAGCGAGAGACAAGTG 
GGAAAGAGTAGGATAAAAAGACAATCTATAAAAAGTAAACATAAAATAA 
AGGTAGTAAGTAGCTTTTGGTTG 


The match between the T2 sequence and the C1/C2 sequence is 

ATTATGTATTGTGTAGTATAGTATATTGTAAGAAATTTTTTTTTCTAGGGA 
ATATGCGTTTTGATGTAGTAGTATTTCACTGTTTTGATTTAGTGTTTGTTGC 
ACGGCAGTAGCGAGAGACAAGTGGGAAAGAGTAGGATAAAAAGACAATC
TATAAAAAGTAAACATAAAATAAAGGTAGTAAGTAGCTTTTGGTTGAACA 
TCCGGGTAAGAGACAACAGGGCT 

A C1/C2 short loop on chromosome 14 whose identifier is 6102 controls the
expression of the genes in this T1/T2 long loop. This C1/C2 short loop is expressed
as a RNA single strand that is 3'UTR to the gene YNL339C and has the DNA
sequence 

AGGAAATTGTTGTTACGAAAGTCAGTGATTATGTATTGTGTAGTATAGTAT 
ATTGTAAGAAATTTTTTTTTCTAGGGAATATGCGTTTTGATGTAGTAGTAT
TTCACTGTTTTGATTTAGTGTTTGTTGCACGGCAGTAGCGAGAGACAAGTG
GGAAAGAGTAGGATAAAAAGACAATCTATAAAAAGTAAACATAAAATAA
AGGTAGTAAGTAGCTTTTGGTTGAACATCCGGGTAAGAGACAACAGGGCT 

The match between the T1 sequence and the C1/C2 sequence is

AGGAAATTGTTGTTACGAAAGTCAGTGATTATGTATTGTGTAGTATAGTAT
ATTGTAAGAAATTTTTTTTTCTAGGGAATATGCGTTTTGATGTAGTAGTAT
TTCACTGTTTTGATTTAGTGTTTGTTGCACGGCAGTAGCGAGAGACAAGTG
GGAAAGAGTAGGATAAAAAGACAATCTATAAAAAGTAAACATAAAATAA
AGGTAGTAAGTAGCTTTTGGTTG

The match between the T2 sequence and the C1/C2 sequence is 

ATTATGTATTGTGTAGTATAGTATATTGTAAGAAATTTTTTTTTCTAGGGA 
ATATGCGTTTTGATGTAGTAGTATTTCACTGTTTTGATTTAGTGTTTGTTGC
ACGGCAGTAGCGAGAGACAAGTGGGAAAGAGTAGGATAAAAAGACAATC 
TATAAAAAGTAAACATAAAATAAAGGTAGTAAGTAGCTTTTGGTTGAACA 
TCCGGGTAAGAGACAACAGGGCT 

A C1/C2 short loop on chromosome 16 whose identifier is 8023 controls the
expression of the genes in this T1/T2 long loop. This C1/C2 short loop is expressed
as a RNA single strand that is 3'UTR to the gene YPR204W and has the DNA
sequence

AGGAAATTGTTGTTACGAAAGTCAGTGATTATGTATTGTGTAGTATAGTAT
ATTGTAAGAAATTTTTTTTTCTAGGGAATATGCGTTTTGATGTAGTAGTAT 
TTCACTGTTTTGATTTAGTGTTTGTTGCACGGCAGTAGCGAGAGACAAGTG 
GGAAAGAGTAGGATAAAAAGACAATCTATAAAAAGTAAACATAAAATAA 
AGGTAGTAAGTAGCTTTTGGTTGAACATCCGGGTAAGAGACAACAGGGCT 

The match between the T1 sequence and the C1/C2 sequence is

AGGAAATTGTTGTTACGAAAGTCAGTGATTATGTATTGTGTAGTATAGTAT
ATTGTAAGAAATTTTTTTTTCTAGGGAATATGCGTTTTGATGTAGTAGTAT
TTCACTGTTTTGATTTAGTGTTTGTTGCACGGCAGTAGCGAGAGACAAGTG
GGAAAGAGTAGGATAAAAAGACAATCTATAAAAAGTAAACATAAAATAA
AGGTAGTAAGTAGCTTTTGGTTG

The match between the T2 sequence and the C1/C2 sequence is 

ATTATGTATTGTGTAGTATAGTATATTGTAAGAAATTTTTTTTTCTAGGGA 
ATATGCGTTTTGATGTAGTAGTATTTCACTGTTTTGATTTAGTGTTTGTTGC
ACGGCAGTAGCGAGAGACAAGTGGGAAAGAGTAGGATAAAAAGACAATC 
TATAAAAAGTAAACATAAAATAAAGGTAGTAAGTAGCTTTTGGTTGAACA 
TCCGGGTAAGAGACAACAGGGCT 

A C1/C2 short loop on chromosome 16 whose identifier is 7356 controls the
expression of the genes in this T1/T2 long loop. This C1/C2 short loop is expressed
as a RNA single strand that is 3'UTR to the gene YPL283C and has the DNA
sequence

AGGAAATTGTTGTTACGAAAGTCAGTGATTATGTATTGTGTAGTATAGTAT
ATTGTAAGAAATTTTTTTTTCTAGGGAATATGCGTTTTGATGTAGTAGTAT 
TTCACTGTTTTGATTTAGTGTTTGTTGCACGGCAGTAGCGAGAGACAAGTG 
GGAAAGAGTAGGATAAAAAGACAATCTATAAAAAGTAAACATAAAATAA 
AGGTAGTAAGTAGCTTTTGGTTGAACATCCGGGTAAGAGACAACAGGGCT 

The match between the T1 sequence and the C1/C2 sequence is

AGGAAATTGTTGTTACGAAAGTCAGTGATTATGTATTGTGTAGTATAGTAT
ATTGTAAGAAATTTTTTTTTCTAGGGAATATGCGTTTTGATGTAGTAGTAT
TTCACTGTTTTGATTTAGTGTTTGTTGCACGGCAGTAGCGAGAGACAAGTG
GGAAAGAGTAGGATAAAAAGACAATCTATAAAAAGTAAACATAAAATAA
AGGTAGTAAGTAGCTTTTGGTTG

The match between the T2 sequence and the C1/C2 sequence is

ATTATGTATTGTGTAGTATAGTATATTGTAAGAAATTTTTTTTTCTAGGGA
ATATGCGTTTTGATGTAGTAGTATTTCACTGTTTTGATTTAGTGTTTGTTGC
ACGGCAGTAGCGAGAGACAAGTGGGAAAGAGTAGGATAAAAAGACAATC 
TATAAAAAGTAAACATAAAATAAAGGTAGTAAGTAGCTTTTGGTTGAACA 
TCCGGGTAAGAGACAACAGGGCT 

A C1/C2 short loop on chromosome 8 whose identifier is 3293 controls the
expression of the genes in this T1/T2 long loop. This C1/C2 short loop is expressed
as a RNA single strand that is 3'UTR to the gene YHL050C and has the DNA
sequence

AGGAAATTGTTGTTACGAAAGTCAGTGATTATGTATTGTGTAGTATAGTAT
ATTGTAAGAAATTTTTTTTTCTAGGGAATATGCGTTTT 

The match between the T1 sequence and the C1/C2 sequence is

AGGAAATTGTTGTTACGAAAGTCAGTGATTATGTATTGTGTAGTATAGTAT
ATTGTAAGAAATTTTTTTTTCTAGGGAATATGCGTTTT 

The match between the T2 sequence and the C1/C2 sequence is 

ATTATGTATTGTGTAGTATAGTATATTGTAAGAAATTTTTTTTTCTAGGGA
ATATGCGTTTT 

A C1/C2 short loop on chromosome 8 whose identifier is 3291 controls the 
expression of the genes in this T1/T2 long loop. This C1/C2 short loop is expressed 
as a RNA single strand that is 3'UTR to the gene YHL050C and has the DNA
sequence

ATGTAGTAGTATTTCACTGTTTTGATTTAGTGTTTGTTGCACGGCAGTAGC
GAGAGACAAGTGGGAAAGAGTAGGATAAAAAGACAA 

The match between the T1 sequence and the C1/C2 sequence is

ATGTAGTAGTATTTCACTGTTTTGATTTAGTGTTTGTTGCACGGCAGTAGC 
GAGAGACAAGTGGGAAAGAGTAGGATAAAAAGACAA 

The match between the T2 sequence and the C1/C2 sequence is 


ATGTAGTAGTATTTCACTGTTTTGATTTAGTGTTTGTTGCACGGCAGTAGC 
GAGAGACAAGTGGGAAAGAGTAGGATAAAAAGACAA 

A C1/C2 short loop on chromosome 2 whose identifier is 145 controls the expression
of the genes in this T1/T2 long loop. This C1/C2 short loop is expressed as a RNA
single strand that is 3'UTR to the gene YBL113C and has the DNA sequence

CTATAAAAAGTAAACATAAAATAAAGGTAGTAAGTAGCTTTTGGTTGAAC 
ATCCGGGTAAGAGACAACAGGCT 

The match between the T1 sequence and the C1/C2 sequence is

CTATAAAAAGTAAACATAAAATAAAGGTAGTAAGTAGCTTTTGGTTG

The match between the T2 sequence and the C1/C2 sequence is

CTATAAAAAGTAAACATAAAATAAAGGTAGTAAGTAGCTTTTGGTTGAAC 
ATCCGGGTAAGAGACAACAGGCT 

A C1/C2 short loop on chromosome 8 whose identifier is 3289 controls the
expression of the genes in this T1/T2 long loop. This C1/C2 short loop is expressed
as a RNA single strand that is 3'UTR to the gene YHL050C and has the DNA
sequence

CTATAAAAAGTAAACATAAAATAAAGGTAGTAAGTAGCTTTTGGTTGAAC
ATCCGGGTAAGAGACAACAGGCT

The match between the T1 sequence and the C1/C2 sequence is

CTATAAAAAGTAAACATAAAATAAAGGTAGTAAGTAGCTTTTGGTTG

The match between the T2 sequence and the C1/C2 sequence is 

CTATAAAAAGTAAACATAAAATAAAGGTAGTAAGTAGCTTTTGGTTGAAC 
ATCCGGGTAAGAGACAACAGGCT 

A C1/C2 short loop on chromosome 2 whose identifier is 146 controls the expression 
of the genes in this T1/T2 long loop. This C1/C2 short loop is expressed as a RNA 
single strand that is 3'UTR to the gene YBL113C and has the DNA sequence

AGGAAATTGTTGTTACGAAAGTCAGTGATTATGTATTGTGTAGTATAGTAT 
ATTGTAAGAAA 

The match between the T1 sequence and the C1/C2 sequence is

AGGAAATTGTTGTTACGAAAGTCAGTGATTATGTATTGTGTAGTATAGTAT
ATTGTAAGAAA 

The match between the T2 sequence and the C1/C2 sequence is 
ATTATGTATTGTGTAGTATAGTATATTGTAAGAAA

Example of a many-to-one connectron in multi-cell eukaryotes - C. elegans

In this example the existence of the T1-T2 (28632-28697) long loop on chromosome 5
is controlled by three C1/C2 short loops (4382, 4375 and 28633).



[Diagram: C1/C2 short loops 4382 (chromosome 1), 4375 (chromosome 1) and 28633
(chromosome 5) each act on the T1-T2 long loop 28632-28697 on chromosome 5.]



A double stranded DNA loop of length 58.451 kilo-bases on chromosome 5 is
bounded on the left by a T1 sequence whose identifier is 28632. This T1 control
element has the DNA sequence

GCAAAAATTGACTGAAAATTTGAATTTCCCGCAAAAAATTGACTGAAAAT 
TTGAATTTCCCGCCAAAAATTGACTGAAAATTTGAA 

This double stranded DNA loop is bounded on the right by a T2 control element
whose identifier is 28697. This T2 control element has the DNA sequence

CAAAAAATTGACTGAAAATTTGAATTTCCCTCCAAAAATTGACTGAAAAT 
TTGAATTTCCCGCCAAAAATTGACTGAAAATTTGAATATCCCGCCAAAAA 
TTGACTGAAAATTTGAATTTCCCGCCGAAAATTAAATGAAAAATGGAATT
TCTCGCCGAA 

This long T1/T2 double stranded DNA loop modulates the expression of the 
following genes 

M162.8 M162.4 M162.3 M162.6 M162.2 M162.1 M162.7

The expression of genes in this T1/T2 long loop is controlled by the following C1/C2 
short loops. 

A C1/C2 short loop on chromosome 1 whose identifier is 4382 controls the
expression of the genes in this T1/T2 long loop. This C1/C2 short loop is expressed
as a RNA single strand that is 3'UTR to the gene Y43F8B.10 and has the DNA
sequence

ATTATAGAAAATTTAAATTTCCCTCCAAAAAATTGACTGAAAATTTGAATT 
TCCCTCCAAAAATTGACTGAAAATTTGAATTTCCCGCCAAAAATTGACTG 
AAAATTTGAATATCCCGCCAAAAATTGACTGAAAATTTGAATTTCCCGCC 
GAAAATTAAATGAAAAATGGAATTTCTCGCCGAAAAATTCAGTAAAAATT
TGAATTTCCTGCCAAAAATTGACTGAAAATTTGAATTTCTTGCCAAAAAA
GTGACTGGGAATTTGAATTTCCCTCCAAAAATTGACTGAAATTTTGAATTT
CCCGCTAAAAGTTGACT

The match between the T1 sequence and the C1/C2 sequence is

CAAAAATTGACTGAAAATTTGAATTTCCCGC



The match between the T2 sequence and the C1/C2 sequence is 

CAAAAAATTGACTGAAAATTTGAATTTCCCTCCAAAAATTGACTGAAAAT
TTGAATTTCCCGCCAAAAATTGACTGAAAATTTGAATATCCCGCCAAAAA 
TTGACTGAAAATTTGAATTTCCCGCCGAAAATTAAATGAAAAATGGAATT 
TCTCGCCGAA 

A C1/C2 short loop on chromosome 1 whose identifier is 4375 controls the
expression of the genes in this T1/T2 long loop. This C1/C2 short loop is expressed
as a RNA single strand that is 3'UTR to the gene Y43F8B.10 and has the DNA
sequence

ATTATAGAAAATTTAAATTTCCCTCCAAAAAATTGACTGAAAATTTGAATT
TCCCTCCAAAAATTGACTGAAAATTTGAATTTCCCGCCAAAAATTGACTG
AAAATTTGAATATCCCGCCAAAAATTGACTGAAAATTTGAATTTCCCGCC 
GAAAATTAAATGAAAAATGGAATTTCTCGCCGAAAAATTCAGTAAAAAT 
TGAATTTCCTGCCAAAAATTGACTGAAAATTTGAATTTCTTGCCAAAAAA 
GTGACTGGGAATTTGAATTTCCCTCCAAAAATTGACTGAAATTTTGAATTT 
CCCGCTAAAAGTTGACT

The match between the T1 sequence and the C1/C2 sequence is

CAAAAATTGACTGAAAATTTGAATTTCCCGC

The match between the T2 sequence and the C1/C2 sequence is 

CAAAAAATTGACTGAAAATTTGAATTTCCCTCCAAAAATTGACTGAAAAT 
TTGAATTTCCCGCCAAAAATTGACTGAAAATTTGAATATCCCGCCAAAAA 
TTGACTGAAAATTTGAATTTCCCGCCGAAAATTAAATGAAAAATGGAATT
TCTCGCCGAA 

A C1/C2 short loop on chromosome 5 whose identifier is 28633 controls the 
expression of the genes in this T1/T2 long loop. This C1/C2 short loop is expressed 
as a RNA single strand that is 3'UTR to the gene M162.5 and has the DNA sequence

CAAAAATTGACTGAAAATTTGAATTTCCCGCAAAAAATTGACTGAAAATT 
TGAATTTCCCGCCAAAAATTGACTGAAAATTTGAA 

The match between the T1 sequence and the C1/C2 sequence is

CAAAAATTGACTGAAAATTTGAATTTCCCGCAAAAAATTGACTGAAAATT
TGAATTTCCCGCCAAAAATTGACTGAAAATTTGAA 

The match between the T2 sequence and the C1/C2 sequence is 
CAAAAAATTGACTGAAAATTTGAATTTCCC 
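
Each C1/C2 entry in the examples above has the same shape: an identifier, a chromosome, the gene whose 3'UTR carries it, a DNA sequence, and the stretches it shares with T1 and T2. A minimal Python sketch of such a record is given here for illustration only. The class name `ShortLoop` and the method `shares` are hypothetical and are not part of this application. The sketch checks that a reported match occurs verbatim in both the C1/C2 sequence and the target sequence, using short loop 28633 and T2 control element 28697 as listed above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ShortLoop:
    """One C1/C2 short loop as listed in the examples (hypothetical layout)."""
    identifier: int
    chromosome: int
    gene: str       # gene whose 3'UTR carries the C1/C2 sequence
    sequence: str   # DNA sequence of the C1/C2 short loop

    def shares(self, match: str, target: str) -> bool:
        """True if `match` occurs verbatim in both this sequence and `target`."""
        return bool(match) and match in self.sequence and match in target


# Values transcribed from the C. elegans example above.
LOOP_28633 = ShortLoop(
    identifier=28633,
    chromosome=5,
    gene="M162.5",
    sequence=(
        "CAAAAATTGACTGAAAATTTGAATTTCCCGCAAAAAATTGACTGAAAATT"
        "TGAATTTCCCGCCAAAAATTGACTGAAAATTTGAA"
    ),
)
T2_28697 = (
    "CAAAAAATTGACTGAAAATTTGAATTTCCCTCCAAAAATTGACTGAAAAT"
    "TTGAATTTCCCGCCAAAAATTGACTGAAAATTTGAATATCCCGCCAAAAA"
    "TTGACTGAAAATTTGAATTTCCCGCCGAAAATTAAATGAAAAATGGAATT"
    "TCTCGCCGAA"
)
T2_MATCH = "CAAAAAATTGACTGAAAATTTGAATTTCCC"

print(LOOP_28633.shares(T2_MATCH, T2_28697))
```

The same check can be repeated for every reported T1 and T2 match as a quick consistency test of a transcribed listing.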

3. One connectron controls the expression of many sets of genes in prokaryotes,
archea, single-celled eukaryotes and multi-celled eukaryotes.

One C1/C2 short loop can control the existence of many T1-T2 long loops. The
C1/C2 short loop can be on the same chromosome as the T1-T2 long loops or on a
different chromosome. This relationship is described as "one-to-many". This
relationship exists in prokaryotes, archea, single-celled eukaryotes and multi-celled
eukaryotes.
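
The one-to-many relationship can be pictured as a mapping from one C1/C2 short loop identifier to several T1-T2 identifier pairs. The Python sketch below is illustrative only and is not presented as the algorithm of this application. The function name `group_connectrons` and the record layout are assumptions. It groups (C1/C2, T1-T2) records in both directions, using the identifiers of the E. coli example that follows.

```python
from collections import defaultdict


def group_connectrons(records):
    """Group (c1c2_id, (t1_id, t2_id)) records in both directions.

    Returns (one_to_many, many_to_one):
      one_to_many: C1/C2 id -> sorted long loops it controls (two or more)
      many_to_one: long loop -> sorted C1/C2 ids controlling it (two or more)
    """
    loops_by_control = defaultdict(set)
    controls_by_loop = defaultdict(set)
    for control_id, loop in records:
        loops_by_control[control_id].add(loop)
        controls_by_loop[loop].add(control_id)
    one_to_many = {
        c: sorted(loops) for c, loops in loops_by_control.items() if len(loops) > 1
    }
    many_to_one = {
        loop: sorted(cs) for loop, cs in controls_by_loop.items() if len(cs) > 1
    }
    return one_to_many, many_to_one


# Identifiers from the E. coli example: short loop 3206 and its four long loops.
ECOLI_RECORDS = [
    (3206, (3208, 3315)),
    (3206, (3436, 3476)),
    (3206, (3439, 3478)),
    (3206, (3441, 3479)),
]

one_to_many, many_to_one = group_connectrons(ECOLI_RECORDS)
print(one_to_many)
print(many_to_one)
```

Keeping both directions in one pass means the same records can also expose the many-to-one case of the preceding section without a second data structure.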

Example of a one-to-many connectron in prokaryotes - E. coli

In this example the existence of T1-T2 (3208-3315, 3436-3476, 3439-3478 and 
3441-3479) long loops are controlled by one C1/C2 short loop (3206). 

[Diagram: C1/C2 short loop 3206 on chromosome 1 acts on four T1-T2 long loops on
chromosome 1: 3208-3315, 3436-3476, 3439-3478 and 3441-3479.]

A double stranded DNA loop of length 93.377 kilo-bases on chromosome 1 is
bounded on the left by a T1 sequence whose identifier is 3208. This T1 control
element has the DNA sequence

ACTCATCTTCGGGTGATGTTTGAGATATTTGCTCTTTAAAAATCTGGATCA 
AGCTGAAAATTGAAACACTGAACAACGAAAGTTGTTCGTGAGTCTCTCAA 
ATTTTCGCAACACGATGATGAATCGAAAGAAACATCTTCGGGTTGTGAGG
TTAAGCGACTAAGCGTACACGGTGGATGCCCTGGC...AGTGTGTTTCGACA 
CACTATCATTAACTGAATCCATAGGTTAATGAGGCGAACCGGGGGAACTG 
AAACATCTAAGTACCCCGAGGAAAAGAAATCAACCGAGATTCCCCCAGTA 
GCGGCGAGCGAACGGGGAGCAGCCCAGAGCCTGAATCAGT 


This double stranded DNA loop is bounded on the right by a T2 control element 
whose identifier is 3315. This T2 control element has the DNA sequence 

TTTGCTCTTTAAAAATCTGGATCAAGCTGAAAATTGAAACACTGAACAAC 
GAAAGTTGTTCGTGAGTCTCTCAAATTTTCGCAACTCTGAAGTGAAACATC
TTCGGGTTGTGAGGTTAAGCGACTAAGCGTACACGGTGGATGCCCTGGCA 
GTCAGAGGCGATGAAGGACGTGCTAATCTGCGATA...GGTTAATGAGGCG 
AACCGGGGGAACTGAAACATCTAAGTACCCCGAGGAAAAGAAATCAACC 
GAGATTCCCCCAGTAGCGGCGAGCGAACGGGGAGCAGCCCAGAGCCTGA 
ATCAGTGTGTGTGTTAGTGGAAGCGTCTGGAAA

This long T1/T2 double stranded DNA loop modulates the expression of the 
following genes 

rrlC rrfC aspT trpT yifA yifE yifB ilvL ilvG_1
ilvM ilvE ilvD ilvA ilvY ilvC ppiC b3776 rep
gppA rhlB trxA rhoL rho rfe wzzE wecB rffH
wecD wecE wzxE yifM_2 wecG yifK argX hisR
leuT proM aslB aslA hemY hemX hemD cyaA
cyaY b3808 dapF uvrD b3814 corA yigF yigG rarD
yigI pldA recQ yigJ yigK pldB yigL yigM metR
metE ysgA udp yigN ubiE yigP b3836 yigU
yigW_1 rfaH yigC ubiB fadA fadB pepQ trkH
hemG rrsA ileT

The expression of genes in this T1/T2 long loop is controlled by the following C1/C2
short loops. 

A C1/C2 short loop on chromosome 1 whose identifier is 3206 controls the 
expression of the genes in this T1/T2 long loop. This C1/C2 short loop is expressed 
as a RNA single strand that is 3'UTR to the gene rrsC and has the DNA sequence 

GTCCCCTTCGTCTAGAGGCCCAGGACACCGCCCTTTCACGGCGGTAACAG 
GGGTTCGAATCCCCTAGGGGACGCCACTTGCTGGTTTGTGAGTGAAAGTC 
ACCTGCCTTAATATCTCAAAACTCATCTTCGGGTGATGTTTGAGATATTTG 
CTCTTTAAAAATCTGGATCAAGCTGAAAATTGAAA...ACCGGCGATTTCCG
AATGGGGAAACCCAGTGTGTTTCGACACACTATCATTAACTGAATCCATA 
GGTTAATGAGGCGAACCGGGGGAACTGAAACATCTAAGTACCCCGAGGA 
AAAGAAATCAACCGAGATTCCCCCAGTAGCGGCGAGCGAACGGGGAGCA 
GCCCAGAGCCTGAATCAGT 

The match between the T1 sequence and the C1/C2 sequence is

ACTCATCTTCGGGTGATGTTTGAGATATTTGCTCTTTAAAAATCTGGATCA 
AGCTGAAAATTGAAACACTGAACAACGAAAGTTGTTCGTGAGTCTCTCAA 
ATTTTCGCAACACGATGATGAATCGAAAGAAACATCTTCGGGTTGTGAGG 
TTAAGCGACTAAGCGTACACGGTGGATGCCCTGGC...AGTGTGTTTCGACA 
CACTATCATTAACTGAATCCATAGGTTAATGAGGCGAACCGGGGGAACTG
AAACATCTAAGTACCCCGAGGAAAAGAAATCAACCGAGATTCCCCCAGTA
GCGGCGAGCGAACGGGGAGCAGCCCAGAGCCTGAATCAGT 

The match between the T2 sequence and the C1/C2 sequence is 

TTTGCTCTTTAAAAATCTGGATCAAGCTGAAAATTGAAACACTGAACAAC 
GAAAGTTGTTCGTGAGTCTCTCAAATTTTCGCAAC 



A double stranded DNA loop of length 41.279 kilo-bases on chromosome 1 is 
bounded on the left by a T1 sequence whose identifier is 3436. This T1 control
element has the DNA sequence 

ACGCAACGCGTGATAAGCAATTTTCGTGTCCCCTTCGTCTAGAGGCCCAG 

GACACCGCCCTTTCACGGCGGTAACAGGGGTTCGAATCCCCTAGGGGACG 

CCACTTGCTGGTT 



This double stranded DNA loop is bounded on the right by a T2 control element 
whose identifier is 3476. This T2 control element has the DNA sequence

AGTGAAAAGCAAGGCGTCTTGCGAAGCAGACTGATACGTCCCCTTCGTCT 
AGAGGCCCAGGACACCGCCCTTTCACGGCGGTAACAGGGGTTCGAATCCC 
CTAGGGGACGCCACTTGCTGGTTTGTGAGTGAAAGTCACCTGCCTTAATA 



This long T1/T2 double stranded DNA loop modulates the expression of the
following genes 

gltT rrlB rrfB murB coaA b3975 tyrU thrT tufB 

secE nusG rplK rplA rplJ rplL rpoB rpoC htrC 

thiH thiF thiE yjaE yjaD hemE nfi yjaG hupA

yjaH yjaI hydH purD purH

This long T1/T2 double stranded DNA loop modulates the expression of the
following C1/C2 short loops

A C1/C2 short loop on chromosome 1 whose identifier is 3206 controls the
expression of the genes in this T1/T2 long loop. This C1/C2 short loop is expressed
as a RNA single strand that is 3'UTR to the gene rrsC and has the DNA sequence 

GTCCCCTTCGTCTAGAGGCCCAGGACACCGCCCTTTCACGGCGGTAACAG 
GGGTTCGAATCCCCTAGGGGACGCCACTTGCTGGTTTGTGAGTGAAAGTC
ACCTGCCTTAATATCTCAAAACTCATCTTCGGGTGATGTTTGAGATATTTG 
CTCTTTAAAAATCTGGATCAAGCTGAAAATTGAAACACTGAACAACGAAA 
GTTGTTCGTGAGTCTCTCAAATTTTCGCAACACGATGATGAATCGAAAGA 
AACATCTTCGGGTTGTGAGGTTAAGCGACTAAGCGTACACGGTGGATGCC 
CTGGCAGTCAGAGGCGATGAAGGACGTGCTAATCTGCGATAAGCGTCGGT
AAGGTGATATGAACCGTTATAACCGGCGATTTCCGAATGGGGAAACCCAG 
TGTGTTTCGACACACTATCATTAACTGAATCCATAGGTTAATGAGGCGAA 
CCGGGGGAACTGAAACATCTAAGTACCCCGAGGAAAAGAAATCAACCGA 
GATTCCCCCAGTAGCGGCGAGCGAACGGGGAGCAGCCCAGAGCCTGAAT 
CAGT

The match between the T1 sequence and the C1/C2 sequence is

GTCCCCTTCGTCTAGAGGCCCAGGACACCGCCCTTTCACGGCGGTAACAG
GGGTTCGAATCCCCTAGGGGACGCCACTTGCTGGTT

The match between the T2 sequence and the C1/C2 sequence is 

GTCCCCTTCGTCTAGAGGCCCAGGACACCGCCCTTTCACGGCGGTAACAG 
GGGTTCGAATCCCCTAGGGGACGCCACTTGCTGGTTTGTGAGTGAAAGTC
ACCTGCCTTAATA

A double stranded DNA loop of length 41.336 kilo-bases on chromosome 1 is
bounded on the left by a T1 sequence whose identifier is 3439. This T1 control
element has the DNA sequence

CCTTAATATCTCAAAACTCATCTTCGGGTGATGTTTGAGATATTTGCTCTTT 
AAAAATCTGGATCAAGCTGAAAATTGAAACACTGAACAACGA 

This double stranded DNA loop is bounded on the right by a T2 control element
whose identifier is 3478. This T2 control element has the DNA sequence

GTGATGTTTGAGATATTTGCTCTTTAAAAATCTGGATCAAGCTGAAAATTG 
AAACACTGAACAACGAAAGTTGTTCGTGAGTCTCTCAAATTTT 


This long T1/T2 double stranded DNA loop modulates the expression of the 
following genes

The expression of genes in this T1/T2 long loop is controlled by the following C1/C2
short loops.

A C1/C2 short loop on chromosome 1 whose identifier is 3206 controls the 
expression of the genes in this T1/T2 long loop. This C1/C2 short loop is expressed 
as a RNA single strand that is 3'UTR to the gene rrsC and has the DNA sequence

GTCCCCTTCGTCTAGAGGCCCAGGACACCGCCCTTTCACGGCGGTAACAG
GGGTTCGAATCCCCTAGGGGACGCCACTTGCTGGTTTGTGAGTGAAAGTC
ACCTGCCTTAATATCTCAAAACTCATCTTCGGGTGATGTTTGAGATATTTG
CTCTTTAAAAATCTGGATCAAGCTGAAAATTGAAA...ACCGGCGATTTCCG

AATGGGGAAACCCAGTGTGTTTCGACACACTATCATTAACTGAATCCATA 

GGTTAATGAGGCGAACCGGGGGAACTGAAACATCTAAGTACCCCGAGGA 

AAAGAAATCAACCGAGATTCCCCCAGTAGCGGCGAGCGAACGGGGAGCA 
GCCCAGAGCCTGAATCAGT 

The match between the T1 sequence and the C1/C2 sequence is

CCTTAATATCTCAAAACTCATCTTCGGGTGATGTTTGAGATATTTGCTCTTT 
AAAAATCTGGATCAAGCTGAAAATTGAAACACTGAACAACGA 

The match between the T2 sequence and the C1/C2 sequence is 

GTGATGTTTGAGATATTTGCTCTTTAAAAATCTGGATCAAGCTGAAAATTG 
AAACACTGAACAACGAAAGTTGTTCGTGAGTCTCTCAAATTTT 



A double stranded DNA loop of length 38.285 kilo-bases on chromosome 1 is 
bounded on the left by a T1 sequence whose identifier is 3441. This T1 control
element has the DNA sequence 

AATTTTCGCAACACGATGATGAATCGAAAGAAACATCTTCGGGTTGTGAG 

GTTAAGCGACTAAGCGTACACGGTGGATGCCCTGGCAGTCAGAGGCGATG 

AAGGACGTGCTAATCTGCGATAAGCGTCGGTAAGGTGATATGAACCGTTA 

TAACCGGCGATTTCCGAATGGGGAAACCCAGTGTGT...GATGAGAGAAGA 

TTTTCAGCCTGATACAGATTAAATCAGAACGCAGAAGCGGTCTGATAAAA 

CAGAATTTGCCTGGCGGCAGTAGCGCGGTGGTCCCACCTGACCCCATGCC 

GAACTCAGAAGTGAAACGCCGTAGCGCCGATGGTAGTGTGGGGTCTCCCC 
ATGCGAG

This double stranded DNA loop is bounded on the right by a T2 control element
whose identifier is 3479. This T2 control element has the DNA sequence 

AAGAAACATCTTCGGGTTGTGAGGTTAAGCGACTAAGCGTACACGGTGGA 

TGCCCTGGCAGTCAGAGGCGATGAAGGACGTGCTAATCTGCGATAAGCGT 

CGGTAAGGTGATATGAACCGTTATAACCGGCGATTTCCGAATGGGGAAAC 

CCAGTGTGTTTCGACACACTATCATTAACTGAATCC...CAGATTAAATCAG 

AACGCAGAAGCGGTCTGATAAAACAGAATTTGCCTGGCGGCAGTAGCGC 

GGTGGTCCCACCTGACCCCATGCCGAACTCAGAAGTGAAACGCCGTAGCG 

CCGATGGTAGTGTGGGGTCTCCCCATGCGAGAGTAGGGAACTGCCAGGCA 
TCAAATTA 



This long T1/T2 double stranded DNA loop modulates the expression of the 
following genes

The expression of genes in this T1/T2 long loop is controlled by the following C1/C2
short loops. 

A C1/C2 short loop on chromosome 1 whose identifier is 3206 controls the expression
of the genes in this T1/T2 long loop. This C1/C2 short loop is expressed as a RNA 
single strand that is 3'UTR to the gene rrsC and has the DNA sequence 

GTCCCCTTCGTCTAGAGGCCCAGGACACCGCCCTTTCACGGCGGTAACAG 
GGGTTCGAATCCCCTAGGGGACGCCACTTGCTGGTTTGTGAGTGAAAGTC 
ACCTGCCTTAATATCTCAAAACTCATCTTCGGGTGATGTTTGAGATATTTG
CTCTTTAAAAATCTGGATCAAGCTGAAAATTGAAA...ACCGGCGATTTCCG
AATGGGGAAACCCAGTGTGTTTCGACACACTATCATTAACTGAATCCATA
GGTTAATGAGGCGAACCGGGGGAACTGAAACATCTAAGTACCCCGAGGA
AAAGAAATCAACCGAGATTCCCCCAGTAGCGGCGAGCGAACGGGGAGCA 
GCCCAGAGCCTGAATCAGT 

The match between the T1 sequence and the C1/C2 sequence is

AATTTTCGCAACACGATGATGAATCGAAAGAAACATCTTCGGGTTGTGAG 
GTTAAGCGACTAAGCGTACACGGTGGATGCCCTGGCAGTCAGAGGCGATG 
AAGGACGTGCTAATCTGCGATAAGCGTCGGTAAGGTGATATGAACCGTTA 
TAACCGGCGATTTCCGAATGGGGAAACCCAGTGTGTTTCGACACACTATC
ATTAACTGAATCCATAGGTTAATGAGGCGAACCGGGGGAACTGAAACATC 
TAAGTACCCCGAGGAAAAGAAATCAACCGAGATTCCCCCAGTAGCGGCG 
AGCGAACGGGGAGCAGCCCAGAGCCTGAATCAGT 

The match between the T2 sequence and the C1/C2 sequence is

AAGAAACATCTTCGGGTTGTGAGGTTAAGCGACTAAGCGTACACGGTGGA 
TGCCCTGGCAGTCAGAGGCGATGAAGGACGTGCTAATCTGCGATAAGCGT 
CGGTAAGGTGATATGAACCGTTATAACCGGCGATTTCCGAATGGGGAAAC 
CCAGTGTGTTTCGACACACTATCATTAACTGAATCCATAGGTTAATGAGGC
GAACCGGGGGAACTGAAACATCTAAGTACCCCGAGGAAAAGAAATCAAC 
CGAGATTCCCCCAGTAGCGGCGAGCGAACGGGGAGCAGCCCAGAGCCTG 
AATCAGT 


Example of a one-to-many connectron in archea - M. jannaschii 

In this example the existence of T1-T2 (534-611, 1139-1159, and 1630-1643) long
loops are controlled by one C1/C2 short loop (1642).

[Diagram: C1/C2 short loop 1642 on chromosome 1 acts on three T1-T2 long loops on
chromosome 1: 534-611, 1139-1159 and 1630-1643.]



A double stranded DNA loop of length 72.886 kilo-bases on chromosome 1 is 
bounded on the left by a T1 sequence whose identifier is 534. This T1 control
element has the DNA sequence 

TAAGTAAATAAAATTTCTCTAACAAATAAGTTAAATT 



This double stranded DNA loop is bounded on the right by a T2 control element 
whose identifier is 611. This T2 control element has the DNA sequence

TAAATAAAATTTCTCTAACAAATAAGTTAAATTTTTGGATTTAAAAAGATA 
AAAATGCT 



This long T1/T2 double stranded DNA loop modulates the expression of the 
following genes

MJ0486 MJ0487 MJ0488 MJ0489 MJ0490 MJ0492 MJ0493 

MJ0494 MJ0495 MJ0496 MJ0497 MJ0499 MJ0500 MJ0501

MJ0538 MJ0539 MJ0541 MJ0542 MJ0543 MJ0544 MJ0545

MJ0547 MJ0548 MJ0549 MJ0550 MJ0552 MJ0553 MJ0554

MJ0555 MJ0556 MJ0558 MJ0559 MJ0560 MJ0561 MJ0562

MJ0563 MJ0564

The expression of genes in this T1/T2 long loop is controlled by the following C1/C2
short loops.

A C1/C2 short loop on chromosome 1 whose identifier is 1642 controls the 
expression of the genes in this T1/T2 long loop. This C1/C2 short loop is expressed 
as a RNA single strand that is 3'UTR to the gene MJ1602 and has the DNA sequence

ATTTAATTTCTAAGGGTTAGCTGGTTTGATTATTTAGAATATTTGAGTTTAT 
TGAATTATTCAGATTTTTAAAAATTAGGATTAATTAGGCAAGTAAATAAA 
ATTTCTCTAACAAATAAGTTAAATTTTTGGATTTAAAAAGATAAAAATACT
CTGTTTTATTATGGAAAGAAAGAT

The match between the T1 sequence and the C1/C2 sequence is

AAGTAAATAAAATTTCTCTAACAAATAAGTTAAATT 


The match between the T2 sequence and the C1/C2 sequence is 

TAAATAAAATTTCTCTAACAAATAAGTTAAATTTTTGGATTTAAAAAGATA 
AAAAT 
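
The "match" entries in these listings read as the longest stretch that a target sequence and a C1/C2 sequence have in common. Assuming that reading (the listing itself does not define the term), a match can be recomputed with a standard longest-common-substring routine. The Python sketch below is illustrative only; the function name `longest_common_substring` is an assumption, and the routine is a textbook dynamic-programming method rather than the procedure of this application. It uses the T1 control element 534 and the C1/C2 short loop 1642 given above.

```python
def longest_common_substring(a: str, b: str) -> str:
    """Return the longest contiguous run of characters found in both a and b."""
    best_len = 0
    best_end = 0  # end index (exclusive) of the best run within a
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                # Extend the run that ended at the previous pair of positions.
                curr[j] = prev[j - 1] + 1
                if curr[j] > best_len:
                    best_len = curr[j]
                    best_end = i
        prev = curr
    return a[best_end - best_len:best_end]


# Sequences transcribed from the M. jannaschii example above.
T1_534 = "TAAGTAAATAAAATTTCTCTAACAAATAAGTTAAATT"
C1C2_1642 = (
    "ATTTAATTTCTAAGGGTTAGCTGGTTTGATTATTTAGAATATTTGAGTTTAT"
    "TGAATTATTCAGATTTTTAAAAATTAGGATTAATTAGGCAAGTAAATAAA"
    "ATTTCTCTAACAAATAAGTTAAATTTTTGGATTTAAAAAGATAAAAATACT"
    "CTGTTTTATTATGGAAAGAAAGAT"
)

print(longest_common_substring(T1_534, C1C2_1642))
```

The routine runs in time proportional to the product of the two sequence lengths, which is adequate for control elements of a few hundred bases; a genome-wide search would need an indexed method instead.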

A double stranded DNA loop of length 14.509 kilo-bases on chromosome 1 is
bounded on the left by a T1 sequence whose identifier is 1139. This T1 control
element has the DNA sequence 

ATTTATTAATTAGTTCAAAGGATTTTTATTTAATTTCTAAGGGTTAGCTGG
TTTGATTGTTTAAAATATTTGAGTTTA 

This double stranded DNA loop is bounded on the right by a T2 control element 
whose identifier is 1159. This T2 control element has the DNA sequence

ATTTAATTTCTAAGGGTTAGCTGGTTTGATTATTTAGAATATTTGAGTTTAT 
TGAATTATTCAGATTTTTAAAAATTA 

This long T1/T2 double stranded DNA loop modulates the expression of the 
following genes 

MJ1096 MJ1097 tRNA-Arg-3 MJ1098 MJ1099 MJ1100 MJ1101
MJ1102 MJ1103 MJ1104 MJ1105 MJ1106 MJ1107 MJ1108 

The expression of genes in this T1/T2 long loop is controlled by the following C1/C2 
short loops. 

A C1/C2 short loop on chromosome 1 whose identifier is 1642 controls the 
expression of the genes in this T1/T2 long loop. This C1/C2 short loop is expressed 
as a RNA single strand that is 3'UTR to the gene MJ1602 and has the DNA sequence 

ATTTAATTTCTAAGGGTTAGCTGGTTTGATTATTTAGAATATTTGAGTTTAT 
TGAATTATTCAGATTTTTAAAAATTAGGATTAATTAGGCAAGTAAATAAA 

ATTTCTCTAACAAATAAGTTAAATTTTTGGATTTAAAAAGATAAAAATACT 
CTGTTTTATTATGGAAAGAAAGAT

The match between the T1 sequence and the C1/C2 sequence is

ATTTAATTTCTAAGGGTTAGCTGGTTTGATT

The match between the T2 sequence and the C1/C2 sequence is 

ATTTAATTTCTAAGGGTTAGCTGGTTTGATTATTTAGAATATTTGAGTTTAT 
TGAATTATTCAGATTTTTAAAAATTA 



A double stranded DNA loop of length 4.998 kilo-bases on chromosome 1 is bounded 
on the left by a T1 sequence whose identifier is 1630. This T1 control element has
the DNA sequence 

TTATTAATTAGTTCAAAGGATTTTTATTTAATTTCTAAGGGTTTGCTGGTTT
GATTATTTAGAATATTTGAGTTTATTGAATTATTCAGATTTTTAAAAATTA 

AGATTAATTAGGAAAGGAAATAAGATTTCTCTAACAGACAAGTTAAATTT 
TTGGATTTAAAAAGATAAAAAT 

This double stranded DNA loop is bounded on the right by a T2 control element 
whose identifier is 1643. This T2 control element has the DNA sequence 

TTAATTTCTAAGGGTTAGCTGGTTTGATTATTTAGAATATTTGAGTTTATTG 

AATTATTCAGATTTTTAAAAATTAGGATTAATTAGGCAAGTAAATAAAAT 

TTCTCTAACAAATAAGTTAAATTTTTGGATTTAAAAAGATAAAAATACTCT 
GTTTTATTATGGAAAGAAAGAT 

This long T1/T2 double stranded DNA loop modulates the expression of the 
following genes 

MJ1597 MJ1598 MJ1599 MJ1600 MJ1601 MJ1602

The expression of genes in this T1/T2 long loop is controlled by the following C1/C2
short loops. 

A C1/C2 short loop on chromosome 1 whose identifier is 1642 controls the 
expression of the genes in this T1/T2 long loop. This C1/C2 short loop is expressed 
as a RNA single strand that is 3'UTR to the gene MJ1602 and has the DNA sequence 

ATTTAATTTCTAAGGGTTAGCTGGTTTGATTATTTAGAATATTTGAGTTTAT 

TGAATTATTCAGATTTTTAAAAATTAGGATTAATTAGGCAAGTAAATAAA 

ATTTCTCTAACAAATAAGTTAAATTTTTGGATTTAAAAAGATAAAAATACT 
CTGTTTTATTATGGAAAGAAAGAT 

The match between the T1 sequence and the C1/C2 sequence is

GCTGGTTTGATTATTTAGAATATTTGAGTTTATTGAATTATTCAGATTTTTA
AAAATTA 

The match between the T2 sequence and the C1/C2 sequence is 

TTAATTTCTAAGGGTTAGCTGGTTTGATTATTTAGAATATTTGAGTTTATTG 

AATTATTCAGATTTTTAAAAATTAGGATTAATTAGGCAAGTAAATAAAAT 

TTCTCTAACAAATAAGTTAAATTTTTGGATTTAAAAAGATAAAAATACTCT 
GTTTTATTATGGAAAGAAAGAT 



Example of a one-to-many connectron in single-cell eukaryotes - S. cerevisiae

In this example the existence of T1-T2 (158-171, 293-317, 4295-4308 and 5916- 
5923) long loops are controlled by one C1/C2 short loop (86). 

[Diagram: C1/C2 short loop 86 on chromosome 1 acts on four T1-T2 long loops, labeled
in the figure as 158-171 (chromosome 1), 293-317 (chromosome 1), 4295-4308
(chromosome 10) and 5916-5923 (chromosome 13).]

A double stranded DNA loop of length 20.391 kilo-bases on chromosome 2 is
bounded on the left by a T1 sequence whose identifier is 158. This T1 control
element has the DNA sequence

CCAATTGTTGGAATAAAAATCAACTATCATCTACTAACTAGTATTTACGTT 
ACTAGTATATTATCATATACGGTGTTAGAAGATGACGCAAATGATGAGAA 
ATAGTCATCTAAATTAGTGGAAGCTGAAACGCAAGGATTGATAATGTAAT 
AG 


This double stranded DNA loop is bounded on the right by a T2 control element
whose identifier is 171. This T2 control element has the DNA sequence

ATAATTGTTGGAATAAAAATCAACTATCATCTACTAACTAGTATTTACGTT
ACTAGTATATTATCATATACGGTGTTAGAAGATGACACAAATGATGAGAA 
ATAGTCATCTAAATTAGTGGAAGCTGAAACGCAAGGATTGATAATGTAAT 
AGGATCAATGAATATTAACATATAAAATGATGATAATAATA 

This long T1/T2 double stranded DNA loop modulates the expression of the 
following genes 

YBL107W-A TL(UAA)B1 YBL107C YBL106C YBL105C YBL104C 
YBL103C YBL102W YBL101C

The expression of genes in this T1/T2 long loop is controlled by the following C1/C2 
short loops. 

A C1/C2 short loop on chromosome 1 whose identifier is 86 controls the expression
of the genes in this T1/T2 long loop. This C1/C2 short loop is expressed as a RNA 
single strand that is 3'UTR to the gene YAR009C and has the DNA sequence 

ATCTATTACATTATGGGTGGTATGTTGGAATAGAAATCAACTATCATCTAC 

TAACTAGTATTTACATTACTAGTATATTATCATATACGGTGTTAGAAGATG 

ACGCAAATGATGAGAAATAGTCATCTAAATTAGTGGAAGCTGAAACGCA 

AGGATTGATAATGTAATAGGATCAATGAATATAAACATATAAAACGGAAT 

GAGGAATAATCGTAATATTAGTATGTAGAAATATAGATTCCATTTTGAGG 

ATTCCTATATCCTCGAGGAGAACTTCTAGTATATTCTGTATACCTAATATT 

ATAGCCTTTATCAACAATGGAATCCCAACAATTATCTCAACATTCACCCAT 
TTCTCAGAA 

The match between the T1 sequence and the C1/C2 sequence is

AAATCAACTATCATCTACTAACTAGTATTTAC

The match between the T2 sequence and the C1/C2 sequence is

AAATCAACTATCATCTACTAACTAGTATTTAC
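
A reported match can be checked mechanically against the sequences it is said to join. The following Python sketch confirms that the match listed above occurs in both the T1 sequence 158 and the C1/C2 sequence 86. The function name `occurs_in_both` is hypothetical, and only the leading printed lines of each sequence are used; the sketch is a plain substring check and is not offered as the matching algorithm of this application.

```python
def occurs_in_both(match: str, target: str, control: str) -> bool:
    """Return True if `match` is a substring of both sequences."""
    return match in target and match in control


# Leading portion of the T1 sequence 158 (first printed line only).
T1_158_PREFIX = "CCAATTGTTGGAATAAAAATCAACTATCATCTACTAACTAGTATTTACGTT"

# Leading portion of the C1/C2 sequence 86 (first two printed lines joined).
C1C2_86_PREFIX = (
    "ATCTATTACATTATGGGTGGTATGTTGGAATAGAAATCAACTATCATCTAC"
    "TAACTAGTATTTACATTACTAGTATATTATCATATACGGTGTTAGAAGATG"
)

# The match reported between T1 158 and C1/C2 86.
REPORTED_MATCH = "AAATCAACTATCATCTACTAACTAGTATTTAC"

print(occurs_in_both(REPORTED_MATCH, T1_158_PREFIX, C1C2_86_PREFIX))
```

The same check can be repeated for the T2 sequence 171, whose first printed line contains the same match.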

A double stranded DNA loop of length 38.470 kilo-bases on chromosome 2 is 
bounded on the left by a T1 sequence whose identifier is 293. This T1 control
element has the DNA sequence

GAATTGTTGGAATAAAAATCCACTATCGTCTATCAACTAATAGTTATATTA 

TCAATATATTATCATATACGGTGTTAAGATGATGACATAAGTTATGAGAA 

GCTGTCATCGAAGTTAGAGGAAGCTGAAGTGCAAGGATTGATAATGTAAT 

AGGATAATGAAACATATAAAACGGAATGAGGAATAATCGTAATATTAGT 

ATGTAGAAATATAGATTCCATTTTGAGGATTCCTATATCCTTGAGGAGAAC
TTCTAGT 

This double stranded DNA loop is bounded on the right by a T2 control element 
whose identifier is 317. This T2 control element has the DNA sequence

AATATTAGTATGTAGAAATATAGATTCCATTTTGAGGATTCCTATATCCTC 
GAGGAGAACTTCTAGTATATTCTGTA 

This long T1/T2 double stranded DNA loop modulates the expression of the 
following genes

YBL005W-B TS(AGA)B YBL004W YBL003C YBL002W YBL001C
YBR001C YBR002C YBR003W YBR004C YBR005W YBR006W
YBR007C YBR008C YBR009C YBR010W YBR011C YBR012C

The expression of genes in this T1/T2 long loop is controlled by the following C1/C2 
short loops.

A C1/C2 short loop on chromosome 1 whose identifier is 86 controls the expression
of the genes in this T1/T2 long loop. This C1/C2 short loop is expressed as an RNA
single strand that is 3'UTR to the gene YAR009C and has the DNA sequence

ATCTATTACATTATGGGTGGTATGTTGGAATAGAAATCAACTATCATCTAC 

TAACTAGTATTTACATTACTAGTATATTATCATATACGGTGTTAGAAGATG 

ACGCAAATGATGAGAAATAGTCATCTAAATTAGTGGAAGCTGAAACGCA 

AGGATTGATAATGTAATAGGATCAATGAATATAAACATATAAAACGGAAT 

GAGGAATAATCGTAATATTAGTATGTAGAAATATAGATTCCATTTTGAGG 

ATTCCTATATCCTCGAGGAGAACTTCTAGTATATTCTGTATACCTAATATT 

ATAGCCTTTATCAACAATGGAATCCCAACAATTATCTCAACATTCACCCAT 
TTCTCAGAA 

The match between the T1 sequence and the C1/C2 sequence is

AAACATATAAAACGGAATGAGGAATAATCGTAATATTAGTATGTAGAAAT 
ATAGATTCCATTTTGAGGATTCCTATATCCT 

The match between the T2 sequence and the C1/C2 sequence is

AATATTAGTATGTAGAAATATAGATTCCATTTTGAGGATTCCTATATCCTC 
GAGGAGAACTTCTAGTATATTCTGTA 



A double stranded DNA loop of length 11.020 kilo-bases on chromosome 10 is 
bounded on the left by a T1 sequence whose identifier is 4295. This T1 control
element has the DNA sequence

AAACGCAAGGATTGATAATGTAATAGGATCAATGAATATAAACATATAAA

ACGGAATGAGGAATAATCGTAATATTAGTATGTAGAAATATAGATTCCAT 

TTTGAGGATTCCTATATCCTCGAGGAGAACTTCTAGTATATTCTG 

This double stranded DNA loop is bounded on the right by a T2 control element 
whose identifier is 4308. This T2 control element has the DNA sequence 

GGAAGCTGAAACGCAAGGATTGATAATGTAATAGGATCAATGAATATAA 
ACATATAAAACGGAATGAGGAATAATCGTAATATTAGTATGTAGAAATAT 
AGATTCCATTTTGAGGATTCCTATATCCTCGAGGAGAACTTCTAGTATATT 
CTGTATACCTAATATTATAGCCTTTATCAA 

This long T1/T2 double stranded DNA loop modulates the expression of the 
following genes 

YJR027W YJR029W 

The expression of genes in this T1/T2 long loop is controlled by the following C1/C2 
short loops. 

A C1/C2 short loop on chromosome 1 whose identifier is 87 controls the expression 
of the genes in this T1/T2 long loop. This C1/C2 short loop is expressed as a RNA 
single strand that is 3'UTR to the gene YAR009C and has the DNA sequence 

ATCTATTACATTATGGGTGGTATGTTGGAATAGAAATCAACTATCATCTAC 

TAACTAGTATTTACATTACTAGTATATTATCATATACGGTGTTAGAAGATG 

ACGCAAATGATGAGAAATAGTCATCTAAATTAGTGGAAGCTGAAACGCA 

AGGATTGATAATGTAATAGGATCAATGAATATAAACATATAAAACGGAAT 

GAGGAATAATCGTAATATTAGTATGTAGAAATATAGATTCCATTTTGAGG 

ATTCCTATATCCTCGAGGAGAACTTCTAGTATATTCTGTATACCTAATATT 

ATAGCCTTTATCAACAATGGAATCCCAACAATTATCTCAACATTCACCCAT 
TTCTCA

A double stranded DNA loop of length 5.462 kilo-bases on chromosome 13 is
bounded on the left by a T1 sequence whose identifier is 5916. This T1 control
element has the DNA sequence

AAGCTGAAGTGCAAGGATTGATAATGTAATAGGATAATGAAACATATAA 

AACGGAATGAGGAATAATCGTAATATTAGTATGTAGAAATATAGATTCCA 

TTTTGAGGATTCCTATATCCTCGAGGAGAACTTCTAGTATATTCTGTA 

This double stranded DNA loop is bounded on the right by a T2 control element 
whose identifier is 5923. This T2 control element has the DNA sequence 

TAATAGGATAATGAAACATATAAAACGGAATGAGGAATAATCGTAATATT 
AGTATGTAGAAATATAGATTCCATTTTGAGGATTCCTATATCCTCGAGGAG 
AACTTCTAGTATATTCTGTATACCTAATATTATAGCCTTTATCAA 

This long T1/T2 double stranded DNA loop modulates the expression of the 
following genes 

YML045W 

The expression of genes in this T1/T2 long loop is controlled by the following C1/C2 
short loops.

A C1/C2 short loop on chromosome 1 whose identifier is 87 controls the expression 
of the genes in this T1/T2 long loop. This C1/C2 short loop is expressed as a RNA 
single strand that is 3'UTR to the gene YAR009C and has the DNA sequence 

ATCTATTACATTATGGGTGGTATGTTGGAATAGAAATCAACTATCATCTAC 
TAACTAGTATTTACATTACTAGTATATTATCATATACGGTGTTAGAAGATG 






ACGCAAATGATGAGAAATAGTCATCTAAATTAGTGGAAGCTGAAACGCA 

AGGATTGATAATGTAATAGGATCAATGAATATAAACATATAAAACGGAAT 

GAGGAATAATCGTAATATTAGTATGTAGAAATATAGATTCCATTTTGAGG 

ATTCCTATATCCTCGAGGAGAACTTCTAGTATATTCTGTATACCTAATATT 

ATAGCCTTTATCAACAATGGAATCCCAACAATTATCTCAACATTCACCCAT 
TTCTCA 



Example of a one-to-many connectron in multi-cell eukaryotes - C. elegans 

In this example the existence of the T1-T2 long loops (16554-16661 and 21565-21590)
is controlled by one C1/C2 short loop (21591).
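
The one-to-many relationship can be held in a simple lookup structure. The following Python sketch groups control relationships by C1/C2 short loop and reports the short loops that control more than one T1-T2 long loop. The identifiers are those of this example; the names `CONTROLS`, `group_by_short_loop` and `one_to_many` are hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# (C1/C2 short loop id, (T1 id, T2 id)) pairs taken from this example.
CONTROLS = [
    (21591, (16554, 16661)),
    (21591, (21565, 21590)),
]


def group_by_short_loop(controls):
    """Map each C1/C2 short loop id to the T1-T2 long loops it controls."""
    grouped = defaultdict(list)
    for short_loop, long_loop in controls:
        grouped[short_loop].append(long_loop)
    return dict(grouped)


def one_to_many(controls):
    """Return only the short loops that control more than one long loop."""
    grouped = group_by_short_loop(controls)
    return {s: loops for s, loops in grouped.items() if len(loops) > 1}


print(one_to_many(CONTROLS))
```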



[Diagram: C1/C2 short loop 21591 on chromosome 5 controls the T1-T2 long loop 16554-16661 on chromosome 4 and the T1-T2 long loop 21565-21590 on chromosome 5.]



A double stranded DNA loop of length 50.159 kilo-bases on chromosome 4 is 
bounded on the left by a T1 sequence whose identifier is 16554. This T1 control
element has the DNA sequence

TGCCTGAAAAAATTGGCTCCGAGTTAGGACACTTGGGGTGGTCAAAAAAT
TTTGTGACTATTGTCAAATGAAAGATCATAGTTGATAACATAAATTCCCAA 
AGTTTCATAAAAATCGATACGCAGCGAACAAAGTTATCAATT 

This double stranded DNA loop is bounded on the right by a T2 control element
whose identifier is 16661. This T2 control element has the DNA sequence

CACTTGGGGTGGTCAAAAAATTTTGTGATTATTGTCAAATGAAAGATCAT
GGTTGATAACATAAATTCCCAAAGTTTCATAAAAATCGATACGCAGCGAA
CAAAGTTATGATTTTTGACCCGGAACTTATTTGGAGACCTA

This long T1/T2 double stranded DNA loop modulates the expression of the 
following genes 

C23H5.7 C23H5.8a C23H5.3 C23H5.2 C23H5.9 C23H5.1

The expression of genes in this T1/T2 long loop is controlled by the following C1/C2 
short loops. 

A C1/C2 short loop on chromosome 5 whose identifier is 21591 controls the
expression of the genes in this T1/T2 long loop. This C1/C2 short loop is expressed
as an RNA single strand that is 3'UTR to the gene F25A2.1 and has the DNA sequence

TATTGTCAAATGAAAGATCATGGTTGATAACATAAATTCCCACAATTTCAT
AAAAATCGATACGCAGCGAACAAAGTTATGATTTTTGACCCGGAACTTAT
TTGGAGACCTAATATT

The match between the T1 sequence and the C1/C2 sequence is

TTTCATAAAAATCGATACGCAGCGAACAAAGTTAT

The match between the T2 sequence and the C1/C2 sequence is

TATTGTCAAATGAAAGATCATGGTTGATAACATAAATTCCCA

A double stranded DNA loop of length 18.142 kilo-bases on chromosome 5 is 
bounded on the left by a T1 sequence whose identifier is 21565. This T1 control
element has the DNA sequence

CTCCGAGTTAGGACACTTGGGGTGGACAAAAAATTTTGTGACTATTGTCA 
AATGAAAGATCATGGTTGATAA 

This double stranded DNA loop is bounded on the right by a T2 control element 
whose identifier is 21590. This T2 control element has the DNA sequence

TATTGTCAAATGAAAGATCATGGTTGATAACATAAATTCCCACAATTTCAT 

AAAAATCGATACGCAGCGAACAAAGTTATGATTTTTGACCCGGAACTTAT 
TTGGAGACCTAATA 


This long T1/T2 double stranded DNA loop modulates the expression of the 
following genes 

T21H3.2 T21H3.1 F25A2.1

The expression of genes in this T1/T2 long loop is controlled by the following C1/C2 
short loops. 

A C1/C2 short loop on chromosome 5 whose identifier is 21591 controls the 
expression of the genes in this T1/T2 long loop. This C1/C2 short loop is expressed
as an RNA single strand that is 3'UTR to the gene F25A2.1 and has the DNA sequence

TATTGTCAAATGAAAGATCATGGTTGATAACATAAATTCCCACAATTTCAT

AAAAATCGATACGCAGCGAACAAAGTTATGATTTTTGACCCGGAACTTAT 
TTGGAGACCTAATATT 

The match between the T1 sequence and the C1/C2 sequence is

TATTGTCAAATGAAAGATCATGGTTGATAA 

The match between the T2 sequence and the C1/C2 sequence is 

TATTGTCAAATGAAAGATCATGGTTGATAACATAAATTCCCACAATTTCAT 

AAAAATCGATACGCAGCGAACAAAGTTATGATTTTTGACCCGGAACTTAT 
TTGGAGACCTAATA

4. Connectrons occur between prokaryotes and their plasmids.

Connectron relationships exist between prokaryotes and their plasmids. These
connectrons implement a control mechanism between the two genomes that makes it
possible for them to form a symbiotic relationship. In the case of D. radiodurans the
relationship is not symmetric: the D. radiodurans genome sends C1/C2 short loops to
the MP1 plasmid.



Example of a prokaryote/plasmid connectron - D. radiodurans 



In this example the existence of the T1-T2 long loops (2654-2694 and 2692-2749) on
chromosome 3, which is the plasmid MP1, is controlled by one C1/C2 short loop (16) on
chromosome 1.
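
Whether a control relationship crosses from one genome to another can be read directly from the chromosome numbers of the two loops. The following Python sketch records a few of the control relationships of this example and picks out the C1/C2 short loops that act on a different chromosome. The record type `Control` and the function `crosses_genomes` are hypothetical names introduced for illustration.

```python
from typing import NamedTuple, Tuple


class Control(NamedTuple):
    short_loop: int              # C1/C2 short loop identifier
    short_chromosome: int        # chromosome carrying the C1/C2 short loop
    long_loop: Tuple[int, int]   # (T1 identifier, T2 identifier)
    long_chromosome: int         # chromosome carrying the T1-T2 long loop


# Relationships taken from this example; chromosome 3 is the plasmid MP1.
CONTROLS = [
    Control(16, 1, (2654, 2694), 3),
    Control(16, 1, (2692, 2749), 3),
    Control(2768, 3, (2654, 2694), 3),
    Control(2653, 3, (2654, 2694), 3),
]


def crosses_genomes(control: Control) -> bool:
    """True when the short loop and the long loop lie on different chromosomes."""
    return control.short_chromosome != control.long_chromosome


cross = sorted({c.short_loop for c in CONTROLS if crosses_genomes(c)})
print(cross)
```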



[Diagram: C1/C2 short loops 16 (chromosome 1), 2768 and 2653 (chromosome 3, plasmid MP1) control the T1-T2 long loop 2654-2694 on chromosome 3 (plasmid MP1), which contains C1/C2 short loop 2693. C1/C2 short loops 16 (chromosome 1), 2768 and 2693 (chromosome 3, plasmid MP1) control the T1-T2 long loop 2692-2749 on chromosome 3 (plasmid MP1), which contains C1/C2 short loops 2693 and 2695.]



A double stranded DNA loop of length 46.903 kilo-bases on chromosome 3 (plasmid
MP1) is bounded on the left by a T1 sequence whose identifier is 2654. This T1
control element has the DNA sequence

CAGCGTTTTTCTCGCTGTTCCTGGACGGCTGAACGCCCTGAATCTCTCCCG

GTATGCAGCCTGCTCGGAGAGTACGATTCGTCGTTGGCTGCACCGAAGTG 

ACGATGGGGCCATTCCGTGGGGCGCGTTACACCAGGCGACTGTCAGTACA 

GCAATCGAGAGTGGGCTGATCAGCCCACTGTGCGTTCTGGCCATCGACGC 

CTCTTTTCACCGCAAAGCCGGTCAGCACACCGCACACCTCGGCTCGTTCTG 
GAATGGCTGTGCCGCGCGGACC 

This double stranded DNA loop is bounded on the right by a T2 control element 
whose identifier is 2694. This T2 control element has the DNA sequence 

GCTGAACGCCCTGAATCTCTCCCGGTATGCAGCCTGCTCGGAGAGTACGA 

TTCGTCGTTGGCTGCACCGAAGTGACGATGGGGCCATTCCGTGGGGCGCG 

TTACACCAGGCGACTGTCAGTACAGCAATCGAGAGTGGGCTGATCAGCCC 

ACTGTGCGTTCTGGCCATCGACGCCTCTTTTCACCGCAAAGCCGGTCAGCA 

CACCGCACACCTCGGCTCGTTCTGGAATGGCTGTGCCGCGCGGACCGAAC 
GCGGAATCGAGCAATCCTGTTGT 

This long T1/T2 double stranded DNA loop modulates the expression of the 
following genes 



DRB0020 DRB0021 DRB0022 DRB0023 DRB0024 DRB0025
DRB0027 DRB0030 DRB0032 DRB0033 DRB0034 DRB0035
DRB0037 DRB0038 DRB0039 DRB0041 DRB0042 DRB0043
DRB0044 DRB0045 DRB0047 DRB0051 DRB0052 DRB0054
DRB0055 DRB0057



This long T1/T2 double stranded DNA loop modulates the expression of the 
following C1/C2 short loops 

A C1/C2 short loop on chromosome 3 (plasmid MP1) whose identifier is 2693
controls the expression of the genes of one or more other T1/T2 long loops. This
C1/C2 short loop is expressed as an RNA single strand that is 3'UTR to the gene
DRB0057 and has the DNA sequence

CTGATGGCCATCCTACAGTACGTTCTCAGCGCGGTCCCGCTGCGCAAGAC 

GCAGCGGAATTTCCTGACCGTGCTGCTCAGCGTTTTTCTCGCTGTTCCTGG 
AC 

The expression of genes in this T1/T2 long loop is controlled by the following C1/C2 
short loops. 

A C1/C2 short loop on chromosome 1 whose identifier is 16 controls the expression 
of the genes in this T1/T2 long loop. This C1/C2 short loop is expressed as a RNA 
single strand that is 3'UTR to the gene DR0009 and has the DNA sequence 

GCTGTGAAATCACCGCTTCCAATGGGTCTGATGGCCATCCTACAGTACGTT 
CTCAGCGCGGTCCCGCTGCGCAAGACGCAGCGGAATTTCCTGACCGTGCT 
GCTCAGCGTTTTTCTCGCTGTTCCTGGACGGCTGAACGCCCTGAATCTCTC 
CCGGTATGCAGCCTGCTCGGAGAGTACGATTCGT 



The match between the T1 sequence and the C1/C2 sequence is

CAGCGTTTTTCTCGCTGTTCCTGGACGGCTGAACGCCCTGAATCTCTCCCG 
GTATGCAGCCTGCTCGGAGAGTACGATTCGTCGTTGGCTGCACCGAAGTG 
ACGATGGGGCCATTCCGTGGGGCGCGTTACACCAGGCGACTGTCAGTACA 
GCAATCGAGAGTGGGCTGATCAGCCCACTGTGCGTTCTGGCCATCGACGC 

CTCTTTTCACCGCAAAGCCGGTCAGCACACCGCACACCTCGGCTCGTTCTG 
GAATGGCTGTGCCGCGCGGACC 

The match between the T2 sequence and the C1/C2 sequence is

GCTGAACGCCCTGAATCTCTCCCGGTATGCAGCCTGCTCGGAGAGTACGA 

TTCGTCGTTGGCTGCACCGAAGTGACGATGGGGCCATTCCGTGGGGCGCG 

TTACACCAGGCGACTGTCAGTACAGCAATCGAGAGTGGGCTGATCAGCCC 

ACTGTGCGTTCTGGCCATCGACGCCTCTTTTCACCGCAAAGCCGGTCAGCA 

CACCGCACACCTCGGCTCGTTCTGGAATGGCTGTGCCGCGCGGACCGAAC 
GCGGAATCGAGCAATCCTGTTGT 

A C1/C2 short loop on chromosome 3 (plasmid MP1) whose identifier is 2768
controls the expression of the genes in this T1/T2 long loop. This C1/C2 short loop is
expressed as an RNA single strand that is 3'UTR to the gene DRB0133 and has the
DNA sequence

GCTGTGAAATCACCGCTTCCAATGGGTCTGATGGCCATCCTACAGTACGTT 
CTCAGCGCGGTCCCGCTGCGCAAGACGCAGCGGAATTTCCTGACCGTGCT 

GCTCAGCGTTTTTCTCGCTGTTCCTGGACGGCTGAACGCCCTGAATCTCTC 
CCGGTATGCAGCCTGCTCGGAGAGTACGATTCGT 



The match between the T1 sequence and the C1/C2 sequence is

CAGCGTTTTTCTCGCTGTTCCTGGACGGCTGAACGCCCTGAATCTCTCCCG 

GTATGCAGCCTGCTCGGAGAGTACGATTCGTCGTTGGCTGCACCGAAGTG 

ACGATGGGGCCATTCCGTGGGGCGCGTTACACCAGGCGACTGTCAGTACA 

GCAATCGAGAGTGGGCTGATCAGCCCACTGTGCGTTCTGGCCATCGACGC 

CTCTTTTCACCGCAAAGCCGGTCAGCACACCGCACACCTCGGCTCGTTCTG 
GAATGGCTGTGCCGCGCGGACC 

The match between the T2 sequence and the C1/C2 sequence is 

GCTGAACGCCCTGAATCTCTCCCGGTATGCAGCCTGCTCGGAGAGTACGA 
TTCGTCGTTGGCTGCACCGAAGTGACGATGGGGCCATTCCGTGGGGCGCG 
TTACACCAGGCGACTGTCAGTACAGCAATCGAGAGTGGGCTGATCAGCCC
ACTGTGCGTTCTGGCCATCGACGCCTCTTTTCACCGCAAAGCCGGTCAGCA

CACCGCACACCTCGGCTCGTTCTGGAATGGCTGTGCCGCGCGGACCGAAC 
GCGGAATCGAGCAATCCTGTTGT 

A C1/C2 short loop on chromosome 3 (plasmid MP1) whose identifier is 2653
controls the expression of the genes in this T1/T2 long loop. This C1/C2 short loop is
expressed as an RNA single strand that is 3'UTR to the gene DRB0017 and has the
DNA sequence

CGGTCCCGCTGCGCAAGACGCAGCGGAATTTCCTGACCGTGCTGCTCAGC 
GTTTTTCTCGCTGTTCCTGGACGGCTGAACGCCCTGAATCTCTCCCGGTAT 

GCAGCCTGCTCGGAGAGTACGATTCGTCGTTGGCTGCACCGAAGTGACGA 
TGGGGCCATTCCGTGGGGCGCGTTACACCAGGCGA 

The match between the T1 sequence and the C1/C2 sequence is

CAGCGTTTTTCTCGCTGTTCCTGGACGGCTGAACGCCCTGAATCTCTCCCG 

GTATGCAGCCTGCTCGGAGAGTACGATTCGTCGTTGGCTGCACCGAAGTG 

ACGATGGGGCCATTCCGTGGGGCGCGTTACACCAGGCGACTGTCAGTACA 

GCAATCGAGAGTGGGCTGATCAGCCCACTGTGCGTTCTGGCCATCGACGC 

CTCTTTTCACCGCAAAGCCGGTCAGCACACCGCACACCTCGGCTCGTTCTG 
GAATGGCTGTGCCGCGCGGACC 

The match between the T2 sequence and the C1/C2 sequence is 

GCTGAACGCCCTGAATCTCTCCCGGTATGCAGCCTGCTCGGAGAGTACGA 

TTCGTCGTTGGCTGCACCGAAGTGACGATGGGGCCATTCCGTGGGGCGCG 

TTACACCAGGCGACTGTCAGTACAGCAATCGAGAGTGGGCTGATCAGCCC 

ACTGTGCGTTCTGGCCATCGACGCCTCTTTTCACCGCAAAGCCGGTCAGCA 

CACCGCACACCTCGGCTCGTTCTGGAATGGCTGTGCCGCGCGGACCGAAC 
GCGGAATCGAGCAATCCTGTTGT

A double stranded DNA loop of length 68.612 kilo-bases on chromosome 3 (plasmid
MP1) is bounded on the left by a T1 sequence whose identifier is 2692. This T1
control element has the DNA sequence

CTGATGGCCATCCTACAGTACGTTCTCAGCGCGGTCCCGCTGCGCAAGAC 

GCAGCGGAATTTCCTGACCGTGCTGCTCAGCGTTTTTCTCGCTGTTCCTGG 
AC 

This double stranded DNA loop is bounded on the right by a T2 control element 
whose identifier is 2749. This T2 control element has the DNA sequence 

AGCGCGGTCCCGCTGCGCAAGACGCAGCGGAATTTCCTGACCGTGCTGCT 

CAGCGTTTTTCTCGCTGTTCCTGGACGGCTGAACGCCCTGAATCTCTCCCG 
GT 

This long T1/T2 double stranded DNA loop modulates the expression of the 
following genes

DRB0098

This long T1/T2 double stranded DNA loop modulates the expression of the
following C1/C2 short loops

A C1/C2 short loop on chromosome 3 (plasmid MP1) whose identifier is 2693
controls the expression of the genes of one or more other T1/T2 long loops. This
C1/C2 short loop is expressed as an RNA single strand that is 3'UTR to the gene
DRB0057 and has the DNA sequence

CTGATGGCCATCCTACAGTACGTTCTCAGCGCGGTCCCGCTGCGCAAGAC 

GCAGCGGAATTTCCTGACCGTGCTGCTCAGCGTTTTTCTCGCTGTTCCTGG 
AC 

A C1/C2 short loop on chromosome 3 (plasmid MP1) whose identifier is 2695
controls the expression of the genes of one or more other T1/T2 long loops. This
C1/C2 short loop is expressed as an RNA single strand that is 3'UTR to the gene
DRB0057 and has the DNA sequence

GCTGAACGCCCTGAATCTCTCCCGGTATGCAGCCTGCTCGGAGAGTACGA
TTCGTCGTTGGCTGCACCGAAGTGACGATGGGGCCATTCCGTGGGGCGCG
TTACACCAGGCGACTGTCAGTACAGCAATCGAGAGTGGGCTGATCAGCCC
ACTGTGCGTTCTGGCCATCGACGCCTCTTTTCACCGCAAAGCCGGTCAGCA
CACCGCACACCTCGGCTCGTTCTGGAATGGCTGTGCCGCGCGGACCGAAC
GCGGAATCGAGCAATCCTGTTGT

The expression of genes in this T1/T2 long loop is controlled by the following C1/C2 
short loops. 

A C1/C2 short loop on chromosome 1 whose identifier is 16 controls the expression
of the genes in this T1/T2 long loop. This C1/C2 short loop is expressed as an RNA
single strand that is 3'UTR to the gene DR0009 and has the DNA sequence

GCTGTGAAATCACCGCTTCCAATGGGTCTGATGGCCATCCTACAGTACGTT
CTCAGCGCGGTCCCGCTGCGCAAGACGCAGCGGAATTTCCTGACCGTGCT
GCTCAGCGTTTTTCTCGCTGTTCCTGGACGGCTGAACGCCCTGAATCTCTC
CCGGTATGCAGCCTGCTCGGAGAGTACGATTCGT

The match between the T1 sequence and the C1/C2 sequence is

CTGATGGCCATCCTACAGTACGTTCTCAGCGCGGTCCCGCTGCGCAAGAC 

GCAGCGGAATTTCCTGACCGTGCTGCTCAGCGTTTTTCTCGCTGTTCCTGG 
AC 

The match between the T2 sequence and the C1/C2 sequence is 

AGCGCGGTCCCGCTGCGCAAGACGCAGCGGAATTTCCTGACCGTGCTGCT 

CAGCGTTTTTCTCGCTGTTCCTGGACGGCTGAACGCCCTGAATCTCTCCCG 
GT 

A C1/C2 short loop on chromosome 3 (plasmid MP1) whose identifier is 2768
controls the expression of the genes in this T1/T2 long loop. This C1/C2 short loop is
expressed as an RNA single strand that is 3'UTR to the gene DRB0133 and has the
DNA sequence

GCTGTGAAATCACCGCTTCCAATGGGTCTGATGGCCATCCTACAGTACGTT 

CTCAGCGCGGTCCCGCTGCGCAAGACGCAGCGGAATTTCCTGACCGTGCT 

GCTCAGCGTTTTTCTCGCTGTTCCTGGACGGCTGAACGCCCTGAATCTCTC 

CCGGTATGCAGCCTGCTCGGAGAGTACGATTCGT...CGGACCGAACGCGGA 

ATCGAGCAATCCTGTTGTGCCCTCATTGATGTCCAGCACCGGCAGGCCTTG 

ACGGTCGATGTCCGTCAGACCCTGACCGGGTCTGAGGCTCCAACTCGTCT 
GGAACAG 

The match between the T1 sequence and the C1/C2 sequence is

CTGATGGCCATCCTACAGTACGTTCTCAGCGCGGTCCCGCTGCGCAAGAC 

GCAGCGGAATTTCCTGACCGTGCTGCTCAGCGTTTTTCTCGCTGTTCCTGG
AC

The match between the T2 sequence and the C1/C2 sequence is

AGCGCGGTCCCGCTGCGCAAGACGCAGCGGAATTTCCTGACCGTGCTGCT 

CAGCGTTTTTCTCGCTGTTCCTGGACGGCTGAACGCCCTGAATCTCTCCCG 
GT

A C1/C2 short loop on chromosome 3 (plasmid MP1) whose identifier is 2693
controls the expression of the genes in this T1/T2 long loop. This C1/C2 short loop is
expressed as an RNA single strand that is 3'UTR to the gene DRB0057 and has the
DNA sequence

CTGATGGCCATCCTACAGTACGTTCTCAGCGCGGTCCCGCTGCGCAAGAC 

GCAGCGGAATTTCCTGACCGTGCTGCTCAGCGTTTTTCTCGCTGTTCCTGG 
AC 

The match between the T1 sequence and the C1/C2 sequence is

CTGATGGCCATCCTACAGTACGTTCTCAGCGCGGTCCCGCTGCGCAAGAC 

GCAGCGGAATTTCCTGACCGTGCTGCTCAGCGTTTTTCTCGCTGTTCCTGG 
AC

The match between the T2 sequence and the C1/C2 sequence is 

AGCGCGGTCCCGCTGCGCAAGACGCAGCGGAATTTCCTGACCGTGCTGCT 
CAGCGTTTTTCTCGCTGTTCCTGGAC
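
Because a T1/T2 long loop can modulate the expression of a C1/C2 short loop that in turn controls a long loop, the relationships in this example form a small regulatory chain. The following Python sketch follows that chain. The two tables hold only the relationships stated for short loops 2693 and 2695 in this example; the names `MODULATES`, `CONTROLS` and `downstream_long_loops` are hypothetical, and the traversal is an illustration rather than a procedure described in this application.

```python
# Long loop -> C1/C2 short loops whose expression it modulates.
MODULATES = {
    (2654, 2694): [2693],
    (2692, 2749): [2693, 2695],
}

# C1/C2 short loop -> long loops whose existence it controls.
CONTROLS = {
    2693: [(2692, 2749)],
}


def downstream_long_loops(long_loop):
    """Return the long loops reachable through the short loops that
    `long_loop` modulates, following the chain until nothing new is found."""
    seen = set()
    stack = [long_loop]
    while stack:
        current = stack.pop()
        for short_loop in MODULATES.get(current, []):
            for target in CONTROLS.get(short_loop, []):
                if target not in seen:
                    seen.add(target)
                    stack.append(target)
    return seen


print(downstream_long_loops((2654, 2694)))
```

Under these tables the long loop 2654-2694 reaches the long loop 2692-2749 through short loop 2693, and the long loop 2692-2749 reaches itself, since it both contains and is controlled by short loop 2693.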

A C1/C2 short loop on chromosome 3 (plasmid MP1) whose identifier is 2653
controls the expression of the genes in this T1/T2 long loop. This C1/C2 short loop is
expressed as an RNA single strand that is 3'UTR to the gene DRB0017 and has the
DNA sequence

CGGTCCCGCTGCGCAAGACGCAGCGGAATTTCCTGACCGTGCTGCTCAGC

GTTTTTCTCGCTGTTCCTGGACGGCTGAACGCCCTGAATCTCTCCCGGTAT 

GCAGCCTGCTCGGAGAGTACGATTCGTCGTTGGCTGCACCGAAGTGACGA 
TGGGGCCATTCCGTGGGGCGCGTTACACCAGGCGA 

The match between the T1 sequence and the C1/C2 sequence is

CGGTCCCGCTGCGCAAGACGCAGCGGAATTTCCTGACCGTGCTGCTCAGC 
GTTTTTCTCGCTGTTCCTGGAC 

The match between the T2 sequence and the C1/C2 sequence is 

CGGTCCCGCTGCGCAAGACGCAGCGGAATTTCCTGACCGTGCTGCTCAGC 
GTTTTTCTCGCTGTTCCTGGACGGCTGAACGCCCTGAATCTCTCCCGGT

5. Connectrons occur in plants and higher animals

Connectron relationships exist in plants and higher animals.

Example of a plant connectron - A. thaliana

In this example the existence of the T1-T2 (423-469) long loop is controlled by six 
C1/C2 short loops (972, 21396, 422, 21762, 21813 and 10882). The T1-T2 long loop 
controls the expression of six genes on chromosome 2 in addition to two C1/C2 (426 
and 430) short loops. 
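
The many-to-one relationship of this example can be indexed by long loop rather than by short loop. The following Python sketch builds that index from the six controlling C1/C2 short loops listed above. The function name `controllers_by_long_loop` is hypothetical and is used only for illustration.

```python
from collections import defaultdict

# The T1-T2 long loop of this example and the C1/C2 short loops that control it.
LONG_LOOP = (423, 469)
CONTROLLERS = [972, 21396, 422, 21762, 21813, 10882]

# One (short loop, long loop) pair per control relationship.
pairs = [(short_loop, LONG_LOOP) for short_loop in CONTROLLERS]


def controllers_by_long_loop(pairs):
    """Map each T1-T2 long loop to the set of short loops that control it."""
    index = defaultdict(set)
    for short_loop, long_loop in pairs:
        index[long_loop].add(short_loop)
    return dict(index)


index = controllers_by_long_loop(pairs)
print(len(index[LONG_LOOP]))
```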



[Diagram: C1/C2 short loops 972, 21396, 422, 21762, 21813 and 10882 control the T1-T2 long loop 423-469 on chromosome 2, which contains C1/C2 short loops 426 and 430.]



A double stranded DNA loop of length 42.285 kilo-bases on chromosome 2 is 
bounded on the left by a T1 sequence whose identifier is 423. This T1 control
element has the DNA sequence

TATCTCTTTAAGGATTAAAAAGTCAAATACTAATTTAATTAATTAAATTTA 
ATTAAAAAACGAAATA 

This double stranded DNA loop is bounded on the right by a T2 control element 
whose identifier is 469. This T2 control element has the DNA sequence

TACTAATTTAATTAATTAAATTTAATTAAAAAACGAAATACATTATTAATT
TTCAAAAATAATAACC 

This long T1/T2 double stranded DNA loop modulates the expression of the 
following genes

At2g02070 At2g02080 At2g02090 At2g02100 At2g02120 At2g02130 

This long T1/T2 double stranded DNA loop modulates the expression of the 
following C1/C2 short loops

A C1/C2 short loop on chromosome 2 whose identifier is 426 controls the expression 
of the genes of one or more other T1/T2 long loops. This C1/C2 short loop is 
expressed as an RNA single strand that is 3'UTR to the gene At2g02060 and has the
DNA sequence

TTCCAAAAATAATAACCAATCAAAATCAACATATAAGATTTGATATCTAA 
ATTTT 

A C1/C2 short loop on chromosome 2 whose identifier is 430 controls the expression
of the genes of one or more other T1/T2 long loops. This C1/C2 short loop is
expressed as an RNA single strand that is 3'UTR to the gene At2g02060 and has the
DNA sequence

TTGCGGAAAAATAATATCATCATTATAAAAAAATAATTAGAGTTTTTTCGC
ATAT

The expression of genes in this T1/T2 long loop is controlled by the following C1/C2 
short loops. 

A C1/C2 short loop on chromosome 2 whose identifier is 972 controls the expression
of the genes in this T1/T2 long loop. This C1/C2 short loop is expressed as an RNA
single strand that is 3'UTR to the gene At2g04240 and has the DNA sequence

GTATGCCATTAGAAATAAAATTTTAAAAGTAAATTAATTCATCTCTTTAAA

AATTAAAAAGTCAAATACTAATTTAATTAATTAAATTTAATTAAAAAACG 
AAATACATTATTAATTT 

The match between the T1 sequence and the C1/C2 sequence is

ATTAAAAAGTCAAATACTAATTTAATTAATTAAATTTAATTAAAAAACGA 
AATA 

The match between the T2 sequence and the C1/C2 sequence is

TACTAATTTAATTAATTAAATTTAATTAAAAAACGAAATACATTATTAATT 
T 

A C1/C2 short loop on chromosome 4 whose identifier is 21396 controls the
expression of the genes in this T1/T2 long loop. This C1/C2 short loop is expressed
as an RNA single strand that is 3'UTR to the gene At4g15300 and has the DNA
sequence

TGCCATTAGAAATAAAATTTTAAAGAGTAAATTAATTTATCTCTTTAAGGA
TTAAAAAGTCAAATACTAATTTAATTAATTAAATTTAATTAAAAAACGAA
ATACATTATTAATTTCCAAAA

The match between the T1 sequence and the C1/C2 sequence is

TATCTCTTTAAGGATTAAAAAGTCAAATACTAATTTAATTAATTAAATTTA
ATTAAAAAACGAAATA

The match between the T2 sequence and the C1/C2 sequence is

TACTAATTTAATTAATTAAATTTAATTAAAAAACGAAATACATTATTAATT 
T 


A C1/C2 short loop on chromosome 2 whose identifier is 422 controls the expression 
of the genes in this T1/T2 long loop. This C1/C2 short loop is expressed as a RNA 
single strand that is 3'UTR to the gene At2g02060 and has the DNA sequence 

TAACCTTAATTTTTGTAAGTAATTATATAGGTATGCCATTAGAAATAAAAT

TTTAAAGAGTAAATTAATTTATCTCTTTAAGGATTAAAAAGTCAAATACTA 
ATTTAATTAATTAAATTTAATTAAAAAACGAAATA 

The match between the T1 sequence and the C1/C2 sequence is

TATCTCTTTAAGGATTAAAAAGTCAAATACTAATTTAATTAATTAAATTTA 
ATTAAAAAACGAAATA 

The match between the T2 sequence and the C1/C2 sequence is

TACTAATTTAATTAATTAAATTTAATTAAAAAACGAAATA 

A C1/C2 short loop on chromosome 4 whose identifier is 21762 controls the
expression of the genes in this T1/T2 long loop. This C1/C2 short loop is expressed
as an RNA single strand that is 3'UTR to the gene At4g17510 and has the DNA
sequence

TTTAAGGATTAAAAAGTCAAATACTAATTTAATTAATTAAATTTAATTAAA 
AAACGAAATACATT 

The match between the T1 sequence and the C1/C2 sequence is

TTTAAGGATTAAAAAGTCAAATACTAATTTAATTAATTAAATTTAATTAAA
AAACGAAATA 

The match between the T2 sequence and the C1/C2 sequence is

TACTAATTTAATTAATTAAATTTAATTAAAAAACGAAATACATT 

A C1/C2 short loop on chromosome 4 whose identifier is 21813 controls the
expression of the genes in this T1/T2 long loop. This C1/C2 short loop is expressed
as an RNA single strand that is 3'UTR to the gene At4g17680 and has the DNA
sequence

TTTAAGGATTAAAAAGTCAAATACTAATTTAATTAATTAAATTTAATTAAA 
AAACGAAATACATT 

The match between the T1 sequence and the C1/C2 sequence is

TTTAAGGATTAAAAAGTCAAATACTAATTTAATTAATTAAATTTAATTAAA
AAACGAAATA

The match between the T2 sequence and the C1/C2 sequence is 
TACTAATTTAATTAATTAAATTTAATTAAAAAACGAAATACATT 

A C1/C2 short loop on chromosome 2 whose identifier is 10882 controls the
expression of the genes in this T1/T2 long loop. This C1/C2 short loop is expressed
as an RNA single strand that is 3'UTR to the gene At2g26540 and has the DNA
sequence

TATCTCTTTAAGGATTAAAAAGTCAAATACTAATTTAATTAATTAAATTTA
ATTAA

The match between the T1 sequence and the C1/C2 sequence is

TATCTCTTTAAGGATTAAAAAGTCAAATACTAATTTAATTAATTAAATTTA 
ATTAA 

The match between the T2 sequence and the C1/C2 sequence is 
TACTAATTTAATTAATTAAATTTAATTAA 



Example of an animal connectron - D. melanogaster

A double stranded DNA loop of length 88.159 kilo-bases on chromosome 4 is 
bounded on the left by a T1 sequence whose identifier is 3340. This T1 control
element has the DNA sequence

ACCTAAAAGAAGTACCGTTTTTTACTCCTAATTACCAATTCTAACCATCCA 

TATCACTTTTTGACGGACTCCGTGAAAATAATTTTTGGCCAAATTTTCGCA 
TTTTTTGTAAGGGGTAACATCATAAAAATT 

This double stranded DNA loop is bounded on the right by a T2 control element 
whose identifier is 3372. This T2 control element has the DNA sequence 

AAAAAAGTACCGCGTTTTACTCCTAATTACCAATTCTAACCATCCATATCA 
CTTTTTGACGGACTCCGTGAAAATAATTTTTGGCCAAATTTTCGCATTTTTT
GTAAGGGGTAACATCATCAAAATTTGCGAAAAA 

This long T1/T2 double stranded DNA loop modulates the expression of the 
following genes 

[Some of the following gene names have not been determined.]

CG11207 - CG2186 CG2157
Ork1

This long T1/T2 double stranded DNA loop modulates the expression of the 
following C1/C2 short loops 

A C1/C2 short loop on chromosome 4 whose identifier is 3362 controls the
expression of the genes of one or more other T1/T2 long loops. This C1/C2 short
loop is expressed as an RNA single strand that is 3'UTR to the gene XXX and has the
DNA sequence

AAAAAAGTACCGCGTTTTACTCCTAATTACCAATTCTAACCATCCATATCA
CTTTTTGACGGACTCCGTTAAAATAATTTrrGACCAAATTTTCGCATTTTTT 
GTAATCAAAATTTGCAAAAAATTGAAAAAAC 

A C1/C2 short loop on chromosome 4 whose identifier is 3364 controls the 
expression of the genes of one or more other T1/T2 long loops. This C1/C2 short
loop is expressed as an RNA single strand that is 3'UTR to the gene XXX and has the
DNA sequence 

CAAAATTTGAATGCAAATCGATTGGGAATCAAAAAACAAACTCAACGAG 
GTATGACATTCCATATTTGGGCCATTATTTCCAA

A C1/C2 short loop on chromosome 4 whose identifier is 3366 controls the 
expression of the genes of one or more other T1/T2 long loops. This C1/C2 short 
loop is expressed as an RNA single strand that is 3'UTR to the gene XXX and has the
DNA sequence

TTTTTTCACAAAAATTAGGAAAATGATTTTGGGTAAAAAAATGAATATTT
AAGTTGGGTTTT 

A C1/C2 short loop on chromosome 4 whose identifier is 3369 controls the 
expression of the genes of one or more other T1/T2 long loops. This C1/C2 short 
loop is expressed as a RNA single strand that is 3'UTR to the gene XXX and has the 
DNA sequence 

AAATCGATTGGGAATCAAAAAACAAACCTCAACGAGGTATGACATTCCAT 
ATCTGGGCCATTATTTCCAATCTTTTGATCAAAATAC 

The expression of genes in this T1/T2 long loop is controlled by the following C1/C2 
short loops. 

A C1/C2 short loop on chromosome 4 whose identifier is 3373 controls the 
expression of the genes in this T1/T2 long loop. This C1/C2 short loop is expressed 
as a RNA single strand that is 3'UTR to the gene XXX and has the DNA sequence 

AAAAAAGTACCGCGTTTTACTCCTAATTACCAATTCTAACCATCCATATCA 
CTTTTTGACGGACTCCGTGAAAATAATTTTTGGCCAAATTTTCGCATTTTTT 
GTAAGGGGTAACATCATCAAAATTTGCGAAAAA 

The match between the T1 sequence and the C1/C2 sequence is

TTTTACTCCTAATTACCAATTCTAACCATCCATATCACTTTTTGACGGACTC 

CGTGAAAATAATTTTTGGCCAAATTTTCGCATTTTTTGTAAGGGGTAACAT 
CAT 

The match between the T2 sequence and the C1/C2 sequence is

AAAAAAGTACCGCGTTTTACTCCTAATTACCAATTCTAACCATCCATATCA

CTTTTTGACGGACTCCGTGAAAATAATTTTTGGCCAAATTTTCGCATTTTTT

GTAAGGGGTAACATCATCAAAATTTGCGAAAAA 



Example of an animal connectron - H. sapiens 

Human genome sequence has been produced by both the NIH-led global
sequencing project and the Celera Genomics, Inc. project, but the gene descriptors for
the sequenced chromosomes do not yet exist. Without the positions and directions of
the genes, it is not possible to select the real connectrons from among the possible
connectrons.
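
The dependence on gene positions can be illustrated with a small filter. The following Python sketch lists the genes that lie wholly between the T1 and T2 sequences of a candidate long loop, so that a candidate enclosing no gene could be set aside. This is only one assumed ingredient of such a selection and is not the selection procedure of this application; the record type `Gene`, the function `genes_inside`, the gene names and all coordinates are hypothetical.

```python
from typing import NamedTuple


class Gene(NamedTuple):
    name: str
    start: int  # first base of the gene (1-based)
    end: int    # last base of the gene (inclusive)


def genes_inside(t1_end: int, t2_start: int, genes):
    """Return the names of genes lying wholly between the end of the T1
    sequence and the start of the T2 sequence."""
    return [g.name for g in genes if g.start > t1_end and g.end < t2_start]


# Hypothetical gene coordinates, for illustration only.
GENES = [Gene("geneA", 1_200, 2_500), Gene("geneB", 9_000, 9_800)]

print(genes_inside(1_000, 5_000, GENES))  # candidate loop that encloses geneA
print(genes_inside(3_000, 5_000, GENES))  # candidate loop that encloses no gene
```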

Human chromosome 22 has been processed and there are 31,000 possible connectrons.

The gene descriptors for all the chromosomes of the human genome should become 
available within the year.

6. Permanent connectrons exist in prokaryotes, archea, single-celled eukaryotes
and multi-celled eukaryotes.

C1/C2 short loops are normally expressed as the 3'UTR of some gene. A class of
connectron relationships exists that permits one C1/C2 short loop to control the
existence of one or more T1-T2 long loops without being subject to any expression
controls other than those of the gene to which the C1/C2 short loop is 3'UTR. These
connectron relationships are described as "permanent". Permanent connectrons exist in
prokaryotes, archea, single-celled eukaryotes and multi-celled eukaryotes.
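
The "permanent" property can be expressed as a simple test. The following Python sketch assumes that a C1/C2 short loop is permanent when it lies inside no T1-T2 long loop, so that no long loop modulates its expression; that reading is an assumption made for this illustration. The table mixes identifiers from two different genomes purely to show both outcomes, and the names `CONTAINING_LONG_LOOPS` and `is_permanent` are hypothetical.

```python
# C1/C2 short loop id -> T1-T2 long loops that contain it.
# 2693 (D. radiodurans) is listed as modulated by long loops 2654-2694 and 2692-2749.
# 3432 (E. coli) is assumed here to lie inside no long loop.
CONTAINING_LONG_LOOPS = {
    2693: [(2654, 2694), (2692, 2749)],
    3432: [],
}


def is_permanent(short_loop, containing):
    """True when no long loop contains (and so modulates) the short loop."""
    return len(containing.get(short_loop, [])) == 0


print(is_permanent(3432, CONTAINING_LONG_LOOPS))
print(is_permanent(2693, CONTAINING_LONG_LOOPS))
```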


Example of a prokaryote permanent connectron - E. coli 

In this example the existence of the T1-T2 (3200-3210) long loop is controlled by a 
C1/C2 short loop (3432). The expression of this C1/C2 short loop is controlled only 
by the gene btuB.

[Diagram: C1/C2 short loop 3432 on chromosome 1 controls the T1-T2 long loop 3200-3210 on chromosome 1.]



A double stranded DNA loop of length 93.339 kilo-bases on chromosome 1 is
bounded on the left by a T1 sequence whose identifier is 3200. This T1 control
element has the DNA sequence

AAGCGGCACTGCTCTTTAACAATTTATCAGACAATCTGTGTGGGCACTCG 
AAGATACGGATTCTTAACGTCGCAAGACGAAAAATGAATACCAAGTCTCA
AGAGTGAACACGTAATTCATTACGAAGTTTAATTCTTTGAGCATCAAACTT 
TTAAATTGAAGAGTTTGATCATGGCTCAGATTGAACGCTGGCGGCAGGCC 
TAACACATGCAAGTCGAACGGTAACAGGAAACAGCTTGCTGTTTCGCTGA 
CGAGTGGCGGACGGGTGAGTAATGTCTGGGAAACTGCCTGATGGAGGGG
GATAACTACTGGAAACGGTAGCTAATACCGCATAACGTCGCAAGACCAAA
GAGGGGGACCTTCGGGCCTCTTGCCATC 

This double stranded DNA loop is bounded on the right by a T2 control element 
whose identifier is 3310. This T2 control element has the DNA sequence 

CAGACAATCTGTGTGGGCACTCGAAGATACGGATTCTTAACGTCGCAAGA 

CGAAAAATGAATACCAAGTCTCAAGAGTGAACACGTAATTCATTACGAAG 

TTTAATTCTTTGAGCGTCAAACTTTTAAATTGAAGAGTTTGATCATGGCTC 

AGATTGAACGCTGGCGGCAGGCCTAACACATGCAAGTCGAACGGTAACA 

GGAAGAAGCTTGCTTCTTTGCTGACGAGTGGCGGACGGGTGAGTAATGTC 

TGGGAAACTGCCTGATGGAGGGGGATAACTACTGGAAACGGTAGCTAAT 

ACCGCATAACGTCGCAAGACCAAAGAGGGGGACCTTCGGGCCTCTTGCCA 

TCGGATGTGCCCAGATGGGATTAGCTAGT 

This long T1/T2 double stranded DNA loop modulates the expression of the 
following genes

The expression of genes in this T1/T2 long loop is controlled by the following C1/C2
short loops.

A C1/C2 short loop on chromosome 1 whose identifier is 3432 controls the
expression of the genes in this T1/T2 long loop. This C1/C2 short loop is expressed as
an RNA single strand that is 3'UTR to the gene btuB and has the DNA sequence

TGCGCGGTCAGAAAATTATTTTAAATTTCCTCTTGTCAGGCCGGAATAACT
CCCTATAATGCGCCACCACTGACACGGAACAACGGCAAACACGCCGCCGG 
GTCAGCGGGGTTCTCCTGAGAACTCCGGCAGAGAAAGCAAAAATAAATG 
CTTGACTCTGTAGCGGGAAGGCGTATTATGCACACC...TGCAACTCGACTC
CATGAAGTCGGAATCGCTAGTAATCGTGGATCAGAATGCCACGGTGAATA 
CGTTCCCGGGCCTTGTACACACCGCCCGTCACACCATGGGAGTGGGTTGC
AAAAGAAGTAGGTAGCTTAACCTTCGGGAGGGCGCTTACCACTTTGTGAT 
TCATGACTGGGGTGAAGTCGTAACAAGGTAACCGTAGGGGAACCTGCGGT 
TGGATCACCTCCTTACCTTAAAGAAGCGT 

The match between the T1 sequence and the C1/C2 sequence is

AAGCGGCACTGCTCTTTAACAATTTATCAGACAATCTGTGTGGGCACTCG 

AAGATACGGATTCTTAACGTCGCAAGACGAAAAATGAATACCAAGTCTCA 

AGAGTGAACACGTAATTCATTACGAAGTTTAATTCTTTGAGC 


The match between the T2 sequence and the C1/C2 sequence is 

CAGACAATCTGTGTGGGCACTCGAAGATACGGATTCTTAACGTCGCAAGA 
CGAAAAATGAATACCAAGTCTCAAGAGTGAACACGTAATTCATTACGAAG 

TTTAATTCTTTGAGCGTCAAACTTTTAAATTGAAGAGTTTGATCATGGCTC

AGATTGAACGCTGGCGGCAGGCCTAACACATGCAAGTCGAACGGTAACA 
GGAAGAAGCTTGCTTCTTTGCTGACGAGTGGCGGACGGGTGAGTAATGTC 
TGGGAAACTGCCTGATGGAGGGGGATAACTACTGGAAACGGTAGCTAAT 
ACCGCATAACGTCGCAAGACCAAAGAGGGGGACCTTCGGGCCTCTTGCCA 

TCGGATGTGCCCAGATGGGATTAGCTAGT

Example of an archea permanent connectron - H. pylori

In this example the existence of the T1-T2 (812-882) long loop is controlled by a 
C1/C2 short loop (1241). The expression of this C1/C2 short loop is controlled only 
by the gene HP1535.

1241 Chromosome 1 

* * *

| Chromosome 1 |

812 882 



A double stranded DNA loop of length 96.385 kilo-bases on chromosome 1 is 
bounded on the left by a T1 sequence whose identifier is 812. This T1 control
element has the DNA sequence 

TTTTACTCATAGGGTTTTTATAGTTCCTAGCGGAACTAAAGCA 

This double stranded DNA loop is bounded on the right by a T2 control element 
whose identifier is 882. This T2 control element has the DNA sequence 

TAGCGGAACTAAAGCATTCATCCCAAACACTAAAGATATTTGG 

This long T1/T2 double stranded DNA loop modulates the expression of the 
following genes

The expression of genes in this T1/T2 long loop is controlled by the following C1/C2
short loops.

A C1/C2 short loop on chromosome 1 whose identifier is 1241 controls the
expression of the genes in this T1/T2 long loop. This C1/C2 short loop is expressed
as an RNA single strand that is 3'UTR to the gene HP1535 and has the DNA sequence

TTTTACTCATAGGGTTTTTATAGTTCCTAGCGGAACTAAAGCATTCATCCC
AAACA

The match between the T1 sequence and the C1/C2 sequence is

TTTTACTCATAGGGTTTTTATAGTTCCTAGCGGAACTAAAGCA

The match between the T2 sequence and the C1/C2 sequence is

TAGCGGAACTAAAGCATTCATCCCAAACA 
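
The two matches listed for this H. pylori example can be reproduced under one simple assumption: that each match is the longest run of bases shared by a target sequence and the C1/C2 sequence. The Python sketch below is illustrative only. The function name longest_common_substring and the dynamic-programming approach are assumptions introduced here, not the procedure used to produce the listings. The three sequences are copied from the example above.

```python
def longest_common_substring(a: str, b: str) -> str:
    """Return the longest contiguous run of characters present in both a and b."""
    best_len = 0   # length of the best run found so far
    best_end = 0   # index in `a` just past the end of that run
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                # Extend the run that ended at a[i-2], b[j-2].
                curr[j] = prev[j - 1] + 1
                if curr[j] > best_len:
                    best_len = curr[j]
                    best_end = i
        prev = curr
    return a[best_end - best_len:best_end]


# Sequences from the H. pylori permanent connectron example.
t1 = "TTTTACTCATAGGGTTTTTATAGTTCCTAGCGGAACTAAAGCA"
t2 = "TAGCGGAACTAAAGCATTCATCCCAAACACTAAAGATATTTGG"
c1c2 = "TTTTACTCATAGGGTTTTTATAGTTCCTAGCGGAACTAAAGCATTCATCCC" + "AAACA"

match_t1 = longest_common_substring(t1, c1c2)
match_t2 = longest_common_substring(t2, c1c2)
print(match_t1)
print(match_t2)
```

For these inputs the first result is the whole T1 sequence and the second is the leading portion of T2 that also ends the C1/C2 sequence, which agrees with the two matches listed above. The quadratic running time is acceptable for sequences of this length; a genome-wide search would need an indexed method.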

Example of a single-celled permanent connectron - S. cerevisiae

In this example the existence of the T1-T2 (5515-5533) long loop is controlled by a
C1/C2 short loop (6102). The expression of this C1/C2 short loop is controlled only
by the gene YNL339C.

6102 Chromosome 14



Chromosome 12 
5515 5533 



A double stranded DNA loop of length 6.466 kilo-bases on chromosome 12 is
bounded on the left by a T1 sequence whose identifier is 5515. This T1 control
element has the DNA sequence

AGGAAATTGTTGTTACGAAAGTCAGTGATTATGTATTGTGTAGTATAGTAT 
ATTGTAAGAAATTTTTTTTTCTAGGGAATATGCGTTTTGATGTAGTAGTAT 
TTCACTGTTTTGATTTAGTGTTTGTTGCACGGCAGTAGCGAGAGACAAGTG 
GGAAAGAGTAGGATAAAAAGACAATCTATAAAAAGTAAACATAAAATAA 
AGGTAGTAAGTAGCTTTTGGTTG 

This double stranded DNA loop is bounded on the right by a T2 control element 
whose identifier is 5533. This T2 control element has the DNA sequence

ATTATGTATTGTGTAGTATAGTATATTGTAAGAAATTTTTTTTTCTAGGGA 
ATATGCGTTTTGATGTAGTAGTATTTCACTGTTTTGATTTAGTGTTTGTTGC 
ACGGCAGTAGCGAGAGACAAGTGGGAAAGAGTAGGATAAAAAGACAATC 
TATAAAAAGTAAACATAAAATAAAGGTAGTAAGTAGCTTTTGGTTGAACA 
TCCGGGTAAGAGACAACAGGGCT 

This long T1/T2 double stranded DNA loop modulates the expression of the 
following genes 

YLR467W 

The expression of genes in this T1/T2 long loop is controlled by the following C1/C2 
short loops.

A C1/C2 short loop on chromosome 14 whose identifier is 6102 controls the
expression of the genes in this T1/T2 long loop. This C1/C2 short loop is expressed
as an RNA single strand that is 3'UTR to the gene YNL339C and has the DNA
sequence

AGGAAATTGTTGTTACGAAAGTCAGTGATTATGTATTGTGTAGTATAGTAT 
ATTGTAAGAAATTTTTTTTTCTAGGGAATATGCGTTTTGATGTAGTAGTAT 
TTCACTGTTTTGATTTAGTGTTTGTTGCACGGCAGTAGCGAGAGACAAGTG 
GGAAAGAGTAGGATAAAAAGACAATCTATAAAAAGTAAACATAAAATAA
AGGTAGTAAGTAGCTTTTGGTTGAACATCCGGGTAAGAGACAACAGGGCT 

The match between the T1 sequence and the C1/C2 sequence is

AGGAAATTGTTGTTACGAAAGTCAGTGATTATGTATTGTGTAGTATAGTAT
ATTGTAAGAAATTTTTTTTTCTAGGGAATATGCGTTTTGATGTAGTAGTAT
TTCACTGTTTTGATTTAGTGTTTGTTGCACGGCAGTAGCGAGAGACAAGTG
GGAAAGAGTAGGATAAAAAGACAATCTATAAAAAGTAAACATAAAATAA
AGGTAGTAAGTAGCTTTTGGTTG

The match between the T2 sequence and the C1/C2 sequence is

ATTATGTATTGTGTAGTATAGTATATTGTAAGAAATTTTTTTTTCTAGGGA
ATATGCGTTTTGATGTAGTAGTATTTCACTGTTTTGATTTAGTGTTTGTTGC
ACGGCAGTAGCGAGAGACAAGTGGGAAAGAGTAGGATAAAAAGACAATC
TATAAAAAGTAAACATAAAATAAAGGTAGTAAGTAGCTTTTGGTTGAACA 
TCCGGGTAAGAGACAACAGGGCT 




Example of a multi-celled permanent connectron - C. elegans

In this example the existence of the T1-T2 (569-596) long loop is controlled by a
C1/C2 short loop (24442). The expression of this C1/C2 short loop is controlled only
by the gene F20D6.4.

24442 Chromosome 5 

* * *

| Chromosome 1 |

569 596 



A double stranded DNA loop of length 30.606 kilo-bases on chromosome 1 is 
bounded on the left by a T1 sequence whose identifier is 569. This T1 control
element has the DNA sequence 

AAATCGAGCCCGTAAATCGACACAAGCGCTACAGTAGTC 

This double stranded DNA loop is bounded on the right by a T2 control element 
whose identifier is 596. This T2 control element has the DNA sequence 

AGTGCTACAGTAGTCATTTAAAGAATTACTGTAGTTTTCGCT 

The expression of genes in this T1/T2 long loop is controlled by the following C1/C2 
short loops. 

A C1/C2 short loop on chromosome 5 whose identifier is 24442 controls the 
expression of the genes in this T1/T2 long loop. This C1/C2 short loop is expressed 
as an RNA single strand that is 3'UTR to the gene F20D6.4 and has the DNA sequence

GAGCCCGTAAATCGACACAAGCGCTACAGTAGTCATTTAAAGAATTACTG 
TAGTTTTC 

The match between the T1 sequence and the C1/C2 sequence is

GAGCCCGTAAATCGACACAAGCGCTACAGTAGTC

The match between the T2 sequence and the C1/C2 sequence is

GCTACAGTAGTCATTTAAAGAATTACTGTAGTTTTC

7. Transient connectrons exist in prokaryotes, archea, single-celled eukaryotes
and multi-celled eukaryotes.

A class of connectron relationships exists that permits one C1/C2 short loop to control
the existence of one or more T1-T2 long loops such that this C1/C2 short loop is itself 
subject to expression control by another T1-T2 long loop which surrounds it. These 
connectron relationships are described as "transient". Transient connectrons exist in 
prokaryotes, archea, single-celled eukaryotes and multi-celled eukaryotes. 

Example of a prokaryote transient connectron - E. coli 

In this example the existence of the T1-T2 (3227-3329) long loop is controlled by the 
C1/C2 (3225) short loop. The expression of this C1/C2 short loop is controlled by the 
existence of the T1-T2 (3216-3224) long loop. The existence of this T1-T2 long loop 
is itself determined by the expression of the C1/C2 (3223) short loop. The C1/C2 
(3225) short loop is the transient connectron. 

3223 Chromosome 1 

* * * 

| Chromosome 1 |

3216 3324
| 3225 |

3225 Chromosome 1 

|

* * *

| Chromosome 1 |

3227 3329 



A double stranded DNA loop of length 93.464 kilo-bases on chromosome 1 is 
bounded on the left by a T1 sequence whose identifier is 3216. This T1 control
element has the DNA sequence

AGCGCAAGCGAAGCTCTTGATCGAAGCCCCGGTAAACGGCGGCCGTAACT
ATAACGGTCCTAAGGTAGCGAAATTCCTTGTCGGGTAAGTTCCGACCTGC 
ACGAATGGCGTAATGATGGCCAGGCTGTCTCCACCCGAGACTCAGTGAAA 
TTGAACTCGCTGTGAAGATGCAGTGTACCCGCGGCAAGACGGAAAGACCC 
CGTGAACCTTTACTATAGCTTGACACTGAACATTGAGCCTTGATGTGTAGG
ATAGGTGGGAGGCTTTGAAGTGTGGACGCCAGTCTGCATGGAGCCGACCT 
TGAAATACCACCCTTTAATGTTTGATGTTCTAACGT 

This double stranded DNA loop is bounded on the right by a T2 control element 
whose identifier is 3324. This T2 control element has the DNA sequence

CCCGGTAAACGGCGGCCGTAACTATAACGGTCCTAAGGTAGCGAAATTCC 
TTGTCGGGTAAGTTCCGACCTGCACGAATGGCGTAATGATGGCCAGGCTG 
TCTCCACCCGAGACTCAGTGAAATTGAACTCGCTGTGAAGATGCAGTGTA 
CCCGCGGCAAGACGGAAAGACCCCGTGAACCTTTACTATAGCTTGACACT
GAACATTGAGCCTTGATGTGTAGGATAGGTGGGAGGCTTTGAAGTGTGGA 
CGCCAGTCTGCATGGAGCCGACCTTGAAATACCACCCTTTAATGTTTGATG 
TTCTAACGTTGACCCGTAATCCGGGTTGCGGACAGT 

This long T1/T2 double stranded DNA loop modulates the expression of the
following genes

This long T1/T2 double stranded DNA loop modulates the expression of the
following C1/C2 short loops 

A C1/C2 short loop on chromosome 1 whose identifier is 3225 controls the 
expression of the genes of one or more other T1/T2 long loops. This C1/C2 short 
loop is expressed as an RNA single strand that is 3'UTR to the gene rrlC and has the
DNA sequence 

AAACAGAATTTGCCTGGCGGCCGTAGCGCGGTGGTCCCACCTGACCCCAT 

GCCGAACTCAGAAGTGAAACGCCGTAGCGCCGATGGTAGTGTGGGGTCTC 
CCCATGCGAGAGTAGGGAACTGCCAGGCATCAAATTA 

The expression of genes in this T1/T2 long loop is controlled by the following C1/C2 
short loops. 

A C1/C2 short loop on chromosome 1 whose identifier is 3323 controls the 
expression of the genes in this T1/T2 long loop. This C1/C2 short loop is expressed 
as an RNA single strand that is 3'UTR to the gene rrlA and has the DNA sequence

GCGAAGCTCTTGATCGAAGCCCCGGTAAACGGCGGCCGTAACTATAACGG 

TCCTAAGGTAGCGAAATTCCTTGTCGGGTAAGTTCCGACCTGCACGAATG 

GCGTAATGATGGCCAGGCTGTCTCCACCCGAGACTCAGTGAAATTGAACT 

CGCTGTGAAGATGCAGTGTACCCGCGGCAAGACGGA...AACAGAATTTGC

CTGGCGGCAGTAGCGCGGTGGTCCCACCTGACCCCATGCCGAACTCAGAA 

GTGAAACGCCGTAGCGCCGATGGTAGTGTGGGGTCTC 

The match between the T1 sequence and the C1/C2 sequence is

GCGAAGCTCTTGATCGAAGCCCCGGTAAACGGCGGCCGTAACTATAACGG 

TCCTAAGGTAGCGAAATTCCTTGTCGGGTAAGTTCCGACCTGCACGAATG 

GCGTAATGATGGCCAGGCTGTCTCCACCCGAGACTCAGTGAAATTGAACT

CGCTGTGAAGATGCAGTGTACCCGCGGCAAGACGGAAAGACCCCGTGAA

CCTTTACTATAGCTTGACACTGAACATTGAGCCTTGATGTGTAGGATAGGT 

GGGAGGCTTTGAAGTGTGGACGCCAGTCTGCATGGAGCCGACCTTGAAAT 
ACCACCCTTTAATGTTTGATGTTCTAACGT 

The match between the T2 sequence and the C1/C2 sequence is 

CCCGGTAAACGGCGGCCGTAACTATAACGGTCCTAAGGTAGCGAAATTCC 

TTGTCGGGTAAGTTCCGACCTGCACGAATGGCGTAATGATGGCCAGGCTG 

TCTCCACCCGAGACTCAGTGAAATTGAACTCGCTGTGAAGATGCAGTGTA 

CCCGCGGCAAGACGGAAAGACCCCGTGAACCTTTACTATAGCTTGACACT 

GAACATTGAGCCTTGATGTGTAGGATAGGTGGGAGGCTTTGAAGTGTGGA 

CGCCAGTCTGCATGGAGCCGACCTTGAAATACCACCCTTTAATGTTTGATG 
TTCTAACGTTGACCCGTAATCCGGGTTGCGGACAGT 



A double stranded DNA loop of length 93.749 kilo-bases on chromosome 1 is 
bounded on the left by a T1 sequence whose identifier is 3227. This T1 control
element has the DNA sequence 

AGCGCCGATGGTAGTGTGGGGTCTCCCCATGCGAGAGTAGGGAACTGCCA 
GG 

This double stranded DNA loop is bounded on the right by a T2 control element 
whose identifier is 3329. This T2 control element has the DNA sequence 

CATGCGAGAGTAGGGAACTGCCAGGCATCAAATAAAACGAAAGGCTCAG 
TCG 

This long T1/T2 double stranded DNA loop modulates the expression of the 
following genes

The expression of genes in this T1/T2 long loop is controlled by the following C1/C2
short loops.

A C1/C2 short loop on chromosome 1 whose identifier is 3225 controls the
expression of the genes in this T1/T2 long loop. This C1/C2 short loop is expressed
as an RNA single strand that is 3'UTR to the gene rrlC and has the DNA sequence

AAACAGAATTTGCCTGGCGGCCGTAGCGCGGTGGTCCCACCTGACCCCAT
GCCGAACTCAGAAGTGAAACGCCGTAGCGCCGATGGTAGTGTGGGGTCTC
CCCATGCGAGAGTAGGGAACTGCCAGGCATCAAATTA



The match between the T1 sequence and the C1/C2 sequence is

AGCGCCGATGGTAGTGTGGGGTCTCCCCATGCGAGAGTAGGGAACTGCCA 
GG 

The match between the T2 sequence and the C1/C2 sequence is

CATGCGAGAGTAGGGAACTGCCAGGCATCAAAT

Example of an archea transient connectron - M. jannaschii






In this example the existence of the T1-T2 (1139-1159) long loop is controlled by the
C1/C2 (533) short loop. The expression of this C1/C2 short loop is controlled by the 
existence of the T1-T2 (532-622) long loop. The existence of this T1-T2 long loop is 
itself determined by the expression of the C1/C2 (1629) short loop. The C1/C2 (533) 
short loop is the transient connectron. 
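
The chain of control described in this paragraph can be traced mechanically. The Python sketch below is illustrative only and rests on a stated assumption: that element identifiers on one chromosome increase with position, so a short loop whose identifier falls strictly between a long loop's T1 and T2 identifiers lies inside that loop (as the diagrams suggest). The names long_loops, enclosing_loops and control_chain are hypothetical and introduced here for the example; the identifiers are those of this M. jannaschii example, all on chromosome 1.

```python
# Long loops keyed by (T1 identifier, T2 identifier); each value lists the
# identifiers of the C1/C2 short loops that control that long loop.
long_loops = {
    (1139, 1159): [533],
    (532, 622): [1629],
}


def enclosing_loops(short_id, loops):
    """Long loops whose T1..T2 identifier range strictly contains short_id."""
    return [span for span in loops if span[0] < short_id < span[1]]


def control_chain(target, loops):
    """Walk upward from a long loop through the short loops that gate it.

    Returns (long_loop, controlling_short_loop, loops_enclosing_that_short_loop)
    tuples in the order they are discovered.
    """
    chain = []
    seen = set()          # guards against cycles (a loop gating itself)
    frontier = [target]
    while frontier:
        span = frontier.pop()
        if span in seen:
            continue
        seen.add(span)
        for short_id in loops.get(span, []):
            parents = enclosing_loops(short_id, loops)
            chain.append((span, short_id, parents))
            frontier.extend(parents)
    return chain


for span, short_id, parents in control_chain((1139, 1159), long_loops):
    print(span, "is controlled by C1/C2", short_id, "which lies inside", parents)
```

With these inputs the walk visits the 1139-1159 loop, finds that its controller 533 sits inside the 532-622 loop, and then finds that the controller of that loop, 1629, is not inside any listed loop. The set of visited loops keeps the walk finite even when a short loop lies inside the loop it controls.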

1629 Chromosome 1 



* * 

| Chromosome 1 |
532 622

| 533 |



533 Chromosome 1 
|

* * *

| Chromosome 1 |
1139 1159 






A double stranded DNA loop of length 78.672 kilo-bases on chromosome 1 is 
bounded on the left by a T1 sequence whose identifier is 532. This T1 control
element has the DNA sequence 


ATATGTTTGAAATTTGAAAATAAGAGTATTTAG 



This double stranded DNA loop is bounded on the right by a T2 control element 
whose identifier is 622. This T2 control element has the DNA sequence 

TTGAAAATAAGAGCATTTAGAAGTTATTAATTAGTTCAAAGGATTTT

This long T1/T2 double stranded DNA loop modulates the expression of the
following genes

This long T1/T2 double stranded DNA loop modulates the expression of the 
following C1/C2 short loops 

A C1/C2 short loop on chromosome 1 whose identifier is 533 controls the expression
of the genes of one or more other T1/T2 long loops. This C1/C2 short loop is
expressed as an RNA single strand that is 3'UTR to the gene MJ0485 and has the DNA
sequence

ATTTTTATTTAATTTCTAAGGGTTAGCTGGTTTGATTATTTAGAATATTTGA 
GTTTATTGAATT

The expression of genes in this T1/T2 long loop is controlled by the following C1/C2 
short loops. 

A C1/C2 short loop on chromosome 1 whose identifier is 1629 controls the
expression of the genes in this T1/T2 long loop. This C1/C2 short loop is expressed
as an RNA single strand that is 3'UTR to the gene MJ1597 and has the DNA sequence

ATATGTTTGAAATTTGAAAATAAGAGTATTTAGAAGTTATTAATTAGTTCA
AAGGATTTTTATTTAATTTCTAAGGGTTTGCTGGTTTGATTATTTAGAATAT 
TTGAGTTTATTGAATTATTCAGATTTTTAAAAATTA 

The match between the T1 sequence and the C1/C2 sequence is

ATATGTTTGAAATTTGAAAATAAGAGTATTTAG

The match between the T2 sequence and the C1/C2 sequence is

ATTTAGAAGTTATTAATTAGTTCAAAGGATTTT 




A double stranded DNA loop of length 14.509 kilo-bases on chromosome 1 is 
bounded on the left by a T1 sequence whose identifier is 1139. This T1 control
element has the DNA sequence 

ATTTATTAATTAGTTCAAAGGATTTTTATTTAATTTCTAAGGGTTAGCTGG
TTTGATTGTTTAAAATATTTGAGTTTA 

This double stranded DNA loop is bounded on the right by a T2 control element 
whose identifier is 1159. This T2 control element has the DNA sequence

ATTTAATTTCTAAGGGTTAGCTGGTTTGATTATTTAGAATATTTGAGTTTAT 
TGAATTATTCAGATTTTTAAAAATTA 

This long T1/T2 double stranded DNA loop modulates the expression of the
following genes

MJ1096 MJ1097 tRNA-Arg-3 MJ1098 MJ1099 MJ1100 MJ1101
MJ1102 MJ1103 MJ1104 MJ1105 MJ1106 MJ1107 MJ1108

The expression of genes in this T1/T2 long loop is controlled by the following C1/C2 
short loops. 

A C1/C2 short loop on chromosome 1 whose identifier is 533 controls the expression 
of the genes in this T1/T2 long loop. This C1/C2 short loop is expressed as an RNA
single strand that is 3'UTR to the gene MJ0485 and has the DNA sequence 

ATTTTTATTTAATTTCTAAGGGTTAGCTGGTTTGATTATTTAGAATATTTGA 
GTTTATTGAATT 

The match between the T1 sequence and the C1/C2 sequence is

ATTTTTATTTAATTTCTAAGGGTTAGCTGGTTTGATT

The match between the T2 sequence and the C1/C2 sequence is

ATTTAATTTCTAAGGGTTAGCTGGTTTGATTATTTAGAATATTTGAGTTTAT 
TGAATT 



Example of a single-celled transient connectron - S. cerevisiae

In this example the existence of the T1-T2 (2840-2859) long loop is controlled by the 
C1/C2 (298) short loop. The expression of this C1/C2 short loop is controlled by the 
existence of the T1-T2 (293-320) long loop. The existence of this T1-T2 long loop is
itself determined by the expression of the C1/C2 (86) short loop. The C1/C2 (298)
short loop is the transient connectron.



86 Chromosome 1 



Chromosome 1 
293 320 
| 298 |



298 Chromosome I 



+ * 

| Chromosome 7 |

2840 2859



A double stranded DNA loop of length 38.470 kilo-bases on chromosome 2 is
bounded on the left by a T1 sequence whose identifier is 293. This T1 control
element has the DNA sequence

GAATTGTTGGAATAAAAATCCACTATCGTCTATCAACTAATAGTTATATTA 
TCAATATATTATCATATACGGTGTTAAGATGATGACATAAGTTATGAGAA 
GCTGTCATCGAAGTTAGAGGAAGCTGAAGTGCAAGGATTGATAATGTAAT
AGGATAATGAAACATATAAAACGGAATGAGGAATAATCGTAATATTAGT 
ATGTAGAAATATAGATTCCATTTTGAGGATTCCTATATCCTTGAGGAGAAC 
TTCTAGT 

This double stranded DNA loop is bounded on the right by a T2 control element
whose identifier is 320. This T2 control element has the DNA sequence

AATATTAGTATGTAGAAATATAGATTCCATTTTGAGGATTCCTATATCCTC 
GAGGAGAACTTCTAGTATATTCTGTA 






This long T1/T2 double stranded DNA loop modulates the expression of the 
following genes

YBL005W-B TS(AGA)B YBL004W YBL003C YBL002W YBL001C
YBR001C YBR002C YBR003W YBR004C YBR005W YBR006W
YBR007C YBR008C YBR009C YBR010W YBR011C YBR012C

This long T1/T2 double stranded DNA loop modulates the expression of the
following C1/C2 short loops 

A C1/C2 short loop on chromosome 2 whose identifier is 298 controls the expression 
of the genes of one or more other T1/T2 long loops. This C1/C2 short loop is 
expressed as an RNA single strand that is 3'UTR to the gene YBL005W-B and has the
DNA sequence 

ATCTATTACATTATGGGTGGTATGTTGGAATAAAAATCCACTATCGTCTAT 

CAACTAATAGTTATATTATCAATATATTATCATATACGGTGTTAAGATGAT 

GACATAAGTTATGAGAAGCTGTCATCGAAGTTAGAGGAAGCTGAAGTGCA 

AGGATTGATAATGTAATAGGATAATGAAACATATAAAACGGAATGAGGA 

ATAATCGTAATATTAGTATGTAGAAATATAGATTCCATTTTGAGGATTCCT 

ATATCCTTGAGGAGAACTTCTAGTATATTCTGTATACCTAATATTATAGCC 

TTTATCAACAATGGAATCCCAACAATTATCTCAACATTC 

The expression of genes in this T1/T2 long loop is controlled by the following C1/C2 
short loops. 

A C1/C2 short loop on chromosome 1 whose identifier is 86 controls the expression 
of the genes in this T1/T2 long loop. This C1/C2 short loop is expressed as an RNA
single strand that is 3'UTR to the gene YAR009C and has the DNA sequence 

ATCTATTACATTATGGGTGGTATGTTGGAATAGAAATCAACTATCATCTAC 
TAACTAGTATTTACATTACTAGTATATTATCATATACGGTGTTAGAAGATG 
ACGCAAATGATGAGAAATAGTCATCTAAATTAGTGGAAGCTGAAACGCA 
AGGATTGATAATGTAATAGGATCAATGAATATAAACATATAAAACGGAAT
GAGGAATAATCGTAATATTAGTATGTAGAAATATAGATTCCATTTTGAGG
ATTCCTATATCCTCGAGGAGAACTTCTAGTATATTCTGTATACCTAATATT 
ATAGCCTTTATCAACAATGGAATCCCAACAATTATCTCAACATTCACCCAT 
TTCTCAGAA 


The match between the T1 sequence and the C1/C2 sequence is

AAACATATAAAACGGAATGAGGAATAATCGTAATATTAGTATGTAGAAAT 
ATAGATTCCATTTTGAGGATTCCTATATCCT 


The match between the T2 sequence and the C1/C2 sequence is 

AATATTAGTATGTAGAAATATAGATTCCATTTTGAGGATTCCTATATCCTC 
GAGGAGAACTTCTAGTATATTCTGTA 



A double stranded DNA loop of length 5.302 kilo-bases on chromosome 7 is bounded 
on the left by a T1 sequence whose identifier is 2840. This T1 control element has
the DNA sequence 

TCTGTTGGAATAAAAATCCACTATCGTCTATCAACTAATAGTTATATTATC 
AATATATTATCATATACGGTGTTAAGATGATGACATAAGTTATGAGAAGC 
TGTCATCGAAGTTAGAGGAAGCTGAAACGCAAGGATTGATAATGTAATAG 
GATCAATGAATATAAACATATAAAACGGAATGAGGAATAATCGTAATATT 
AGTATGTAGAAATATAGATTCCATTTTGAGGATTCCTATATCCTCGAGGAG 
AACTTCTAGTATATTCTGTATACCTAAATTATAGCCTTTATCAACAATGGA 
ATCCCAACAA 

This double stranded DNA loop is bounded on the right by a T2 control element
whose identifier is 2859. This T2 control element has the DNA sequence

CTATCAACTAATAGTTATATTATCAATATATTATCATATACGGTGTTAAGA
TGATGACATAAGTTATGAGAAGCTGTCATCGAAGTTAGAGGAAGCTGAAA 
CGCAAGGATTGATAATGTAATAGGATCAATGAATATAAACATATAAAACG 
GAATGAGGAATAATCGTAATATTAGTATGTAGAAATATAGATTCCATTTT 
GAGGATTCCTATATCCTCGAGGAGAACTTCTAGTATATTCTGTATACCTAA
TATTATAGCCTTTATCAACAATGGAATCCCAACAATTATCTCAACATTCAC 
ATATTTCTCAT 

The expression of genes in this T1/T2 long loop is controlled by the following C1/C2 
short loops.

A C1/C2 short loop on chromosome 2 whose identifier is 298 controls the expression 
of the genes in this T1/T2 long loop. This C1/C2 short loop is expressed as an RNA
single strand that is 3'UTR to the gene YBL005W-B and has the DNA sequence 

ATCTATTACATTATGGGTGGTATGTTGGAATAAAAATCCACTATCGTCTAT
CAACTAATAGTTATATTATCAATATATTATCATATACGGTGTTAAGATGAT
GACATAAGTTATGAGAAGCTGTCATCGAAGTTAGAGGAAGCTGAAGTGCA
AGGATTGATAATGTAATAGGATAATGAAACATATAAAACGGAATGAGGA
ATAATCGTAATATTAGTATGTAGAAATATAGATTCCATTTTGAGGATTCCT
ATATCCTTGAGGAGAACTTCTAGTATATTCTGTATACCTAATATTATAGCC
TTTATCAACAATGGAATCCCAACAATTATCTCAACATTC

The match between the T1 sequence and the C1/C2 sequence is

TGTTGGAATAAAAATCCACTATCGTCTATCAACTAATAGTTATATTATCAA 

TATATTATCATATACGGTGTTAAGATGATGACATAAGTTATGAGAAGCTG 

TCATCGAAGTTAGAGGAAGCTGAA 

The match between the T2 sequence and the C1/C2 sequence is

CTATCAACTAATAGTTATATTATCAATATATTATCATATACGGTGTTAAGA
TGATGACATAAGTTATGAGAAGCTGTCATCGAAGTTAGAGGAAGCTGAA 



Example of a multi-celled transient connectron - C. elegans 



In this example the existence of the T1-T2 (22072-22108) long loop is controlled by 
the C1/C2 (125) short loop. The expression of this C1/C2 short loop is controlled by
the existence of the T1-T2 (110-129) long loop. The existence of this T1-T2 long 
loop is itself determined by the expression of the C1/C2 (16859) short loop. The 
C1/C2 (125) short loop is the transient connectron. 



16859 Chromosome 4 



| Chromosome 1 |

110 129

| 125 |



125 Chromosome 1

* * *

| Chromosome 5 |

22072 22108 



A double stranded DNA loop of length 18.855 kilo-bases on chromosome 1 is
bounded on the left by a T1 sequence whose identifier is 110. This T1 control
element has the DNA sequence 

AGCTTAGGCTTAAGCTTAGGCTTAAGCTTAGGC 

This double stranded DNA loop is bounded on the right by a T2 control element 
whose identifier is 129. This T2 control element has the DNA sequence

TTCTCCCGCATTTTTTGTAGATCTACGTAGATCAAACCGAAATGAGGCACT

TTCTGAATCCACGAGCTAGGCTTAAGCTTAGGCTTAAGCTTAGGCCTTTTC 
TCAGGCTTAGGCTTAGGCTTA 


This long T1/T2 double stranded DNA loop modulates the expression of the 
following genes 

ZC123.3 ZC123.2 


This long T1/T2 double stranded DNA loop modulates the expression of the 
following C1/C2 short loops 

A C1/C2 short loop on chromosome 1 whose identifier is 125 controls the expression
of the genes of one or more other T1/T2 long loops. This C1/C2 short loop is
expressed as an RNA single strand that is 3'UTR to the gene ZC123.3 and has the
DNA sequence

ACGCGCCGTAAATCTACCCCAGATATGGCCGAGCCAAAATGGCCTAGTTC
GGCAAACTCTTTCATTTCAATTTATGAGGGAAGCCAGAA

The expression of genes in this T1/T2 long loop is controlled by the following C1/C2
short loops.

A C1/C2 short loop on chromosome 4 whose identifier is 16859 controls the
expression of the genes in this T1/T2 long loop. This C1/C2 short loop is expressed
as an RNA single strand that is 3'UTR to the gene F58E2.7 and has the DNA sequence

CTTAGGCTTAAGCTTAGGCTTAAGCTTAGGCTTAAGCTTAGGCTTAAGCTT 

AGGCTTAAGCTTAGGCTTAAGCTTAGGCTTAAGCTTAGGCTTAAGCTTAG

GCTTAAGCTTAGGCTTAAGCTTAGGCTTAAGCTTAGGCTTAAGCTTAGGCT 
TAAGCTTAGACTTA

The match between the T1 sequence and the C1/C2 sequence is



AGCTTAGGCTTAAGCTTAGGCTTAAGCTTAGGC 


The match between the T2 sequence and the C1/C2 sequence is 
TAGGCTTAAGCTTAGGCTTAAGCTTAGGC 

A double stranded DNA loop of length 51.031 kilo-bases on chromosome 5 is 
bounded on the left by a T1 sequence whose identifier is 22072. This T1 control
element has the DNA sequence 


CGCAACGCGCCGTAAATCTACCCCAGATATGGCCGAGCCAAAATGACCTA 
GTTCGGC 



This double stranded DNA loop is bounded on the right by a T2 control element 
whose identifier is 22108. This T2 control element has the DNA sequence



TGACAATCGCCTGCCGGACAACGCGTGGAAAAGTGTCGTGTACTCCACAC 
GGACAAATACATTTAGTTTTACAACTAAAATCGAACCGCGACGCGACACG 
CAACGCGACGTAAATCTACCCCAGATATGGCCGAGCCAAAATGGCCTAGT 
TCGGCAAACTCTTCTATTTC



This long T1/T2 double stranded DNA loop modulates the expression of the 
following genes 



F36H9.3 F36H9.4 F36H9.5 F36H9.2 F36H9.1 F36H9.6

The expression of genes in this T1/T2 long loop is controlled by the following C1/C2
short loops. 

A C1/C2 short loop on chromosome 1 whose identifier is 125 controls the expression 
of the genes in this T1/T2 long loop. This C1/C2 short loop is expressed as an RNA
single strand that is 3'UTR to the gene ZC123.3 and has the DNA sequence

ACGCGCCGTAAATCTACCCCAGATATGGCCGAGCCAAAATGGCCTAGTTC 
GGCAAACTCTTTCATTTCAATTTATGAGGGAAGCCAGAA 

The match between the T1 sequence and the C1/C2 sequence is

ACGCGCCGTAAATCTACCCCAGATATGGCCGAGCCAAAATG

The match between the T2 sequence and the C1/C2 sequence is

CGTAAATCTACCCCAGATATGGCCGAGCCAAAATGGCCTAGTTCGGCAAA 
CTCTT

8. Self-limiting connectrons occur in prokaryotes, archea, single-celled
eukaryotes and multi-celled eukaryotes

A class of connectron relationships exists that permits one C1/C2 short loop to control
the existence of the T1-T2 long loop that surrounds it. These connectron relationships 
are described as "self-limiting". Self-limiting connectrons exist in prokaryotes, 
archea, single-celled eukaryotes and multi-celled eukaryotes. 
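
The three classes defined in sections 6 to 8 differ only in where the controlling C1/C2 short loop sits relative to the long loops. The Python sketch below is an illustration, not the method of this application. It assumes that identifiers on a chromosome increase with position, so that a short loop lies inside a long loop when its identifier falls strictly between the T1 and T2 identifiers. The names LongLoop, surrounds and classify are hypothetical. The data come from the H. pylori permanent example, the M. jannaschii transient example and the E. coli self-limiting example in this document.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LongLoop:
    chromosome: int
    t1: int                  # identifier of the T1 element
    t2: int                  # identifier of the T2 element
    controllers: frozenset   # identifiers of the controlling C1/C2 short loops


def surrounds(loop, chromosome, short_id):
    """True when the short loop lies between the loop's T1 and T2 elements."""
    return loop.chromosome == chromosome and loop.t1 < short_id < loop.t2


def classify(chromosome, short_id, loops):
    """Classify one C1/C2 short loop against the long loops of one genome."""
    controlled = [loop for loop in loops if short_id in loop.controllers]
    if not controlled:
        return "not a controller"
    enclosing = [loop for loop in loops if surrounds(loop, chromosome, short_id)]
    if any(loop in controlled for loop in enclosing):
        return "self-limiting"   # it controls the loop that surrounds it
    if enclosing:
        return "transient"       # it is surrounded by a loop it does not control
    return "permanent"           # no listed long loop surrounds it


h_pylori = [LongLoop(1, 812, 882, frozenset({1241}))]
m_jannaschii = [
    LongLoop(1, 1139, 1159, frozenset({533})),
    LongLoop(1, 532, 622, frozenset({1629})),
]
e_coli = [LongLoop(1, 1704, 1718, frozenset({1705, 1713}))]

print(classify(1, 1241, h_pylori))
print(classify(1, 533, m_jannaschii))
print(classify(1, 1705, e_coli))
```

Each genome is passed as its own list because identifiers are only meaningful within one genome. The classification is relative to the loops supplied, so a short loop reported as permanent could be reclassified if further long loops were added to the list.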

Example of prokaryotic self-limiting connectrons - E. coli

In this example the existence of the T1-T2 (1704-1718) long loop is controlled by 
two C1/C2 (1705 and 1713) short loops. The expression of these C1/C2 short loops is 
controlled by the existence of the T1-T2 (1704-1718) long loop. The existence of this 
T1-T2 long loop is itself determined by the expression of the two C1/C2 (1705 and 
1713) short loops. The C1/C2 (1705 and 1713) short loops are the self-limiting 
connectrons. 

1705 Chromosome 1 
1713 Chromosome 1 

* * * 

| Chromosome 1 |

1704 1718
| 1705 1713 |



A double stranded DNA loop of length 15.259 kilo-bases on chromosome 1 is 
bounded on the left by a T1 sequence whose identifier is 1704. This T1 control
element has the DNA sequence 

CGCCCCGTTCACACGATTCCTCTGTAGTTCAGTCGGTAGAACGGCGGACT 
GTTAATCCGTATGTCACTGGT

This double stranded DNA loop is bounded on the right by a T2 control element
whose identifier is 1718. This T2 control element has the DNA sequence 

TTCAGTCGGTAGAACGGCGGACTGTTAATCCGTATGTCACTGGTTCGAGTC 
CAGTCAGAGGAGCCAAATTC 

This long T1/T2 double stranded DNA loop modulates the expression of the 
following genes 

asnT b1978 b1979 b1980 shiA amn b1983 asnW
yeeO asnU

This long T1/T2 double stranded DNA loop modulates the expression of the 
following C1/C2 short loops 

A C1/C2 short loop on chromosome 1 whose identifier is 1705 controls the 
expression of the genes of one or more other T1/T2 long loops. This C1/C2 short 
loop is expressed as an RNA single strand that is 3'UTR to the gene and has the DNA
sequence 

CGCCCCGTTCACACGATTCCTCTGTAGTTCAGTCGGTAGAACGGCGGACT 
GTTAATCCGTATGTCACTGGTTCGAGTCCAGTCAGAGGAGCCAAATTC 

A C1/C2 short loop on chromosome 1 whose identifier is 1713 controls the 
expression of the genes of one or more other T1/T2 long loops. This C1/C2 short 
loop is expressed as an RNA single strand that is 3'UTR to the gene asnW and has the
DNA sequence 

CACGATTCCTCTGTAGTTCAGTCGGTAGAACGGCGGACTGTTAATCCGTAT 
GTCACTGGTTCGAGTCCAGTCAGAGGAGCCAAATT

The expression of genes in this T1/T2 long loop is controlled by the following C1/C2
short loops.

A C1/C2 short loop on chromosome 1 whose identifier is 1705 controls the
expression of the genes in this T1/T2 long loop. This C1/C2 short loop is expressed
as an RNA single strand that is 3'UTR to the gene and has the DNA sequence

CGCCCCGTTCACACGATTCCTCTGTAGTTCAGTCGGTAGAACGGCGGACT 
GTTAATCCGTATGTCACTGGTTCGAGTCCAGTCAGAGGAGCCAAATTC 

The match between the T1 sequence and the C1/C2 sequence is

CGCCCCGTTCACACGATTCCTCTGTAGTTCAGTCGGTAGAACGGCGGACT 
GTTAATCCGTATGTCACTGGT 

The match between the T2 sequence and the C1/C2 sequence is 

TTCAGTCGGTAGAACGGCGGACTGTTAATCCGTATGTCACTGGTTCGAGTC 
CAGTCAGAGGAGCCAAATTC 

A C1/C2 short loop on chromosome 1 whose identifier is 1713 controls the
expression of the genes in this T1/T2 long loop. This C1/C2 short loop is expressed
as an RNA single strand that is 3'UTR to the gene asnW and has the DNA sequence



CACGATTCCTCTGTAGTTCAGTCGGTAGAACGGCGGACTGTTAATCCGTAT
GTCACTGGTTCGAGTCCAGTCAGAGGAGCCAAATT 

The match between the T1 sequence and the C1/C2 sequence is

CACGATTCCTCTGTAGTTCAGTCGGTAGAACGGCGGACTGTTAATCCGTAT
GTCACTGGT

The match between the T2 sequence and the C1/C2 sequence is

TTCAGTCGGTAGAACGGCGGACTGTTAATCCGTATGTCACTGGTTCGAGTC 
CAGTCAGAGGAGCCAAATT 




Example of archea self-limiting connectrons - M. jannaschii

In this example the existence of the T1-T2 (1447-1471) long loop is controlled by 
two C1/C2 (1448 and 1470) short loops. The expression of these C1/C2 short loops is 
controlled by the existence of the T1-T2 (1447-1471) long loop. The existence of this 
T1-T2 long loop is itself determined by the expression of the two C1/C2 (1448 and
1470) short loops. The C1/C2 (1448 and 1470) short loops are the self-limiting
connectrons.

1448          Chromosome 1
1470          Chromosome 1

|          Chromosome 1          |
1447                          1471
|   1448                1470   |
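
The self-limiting relationship described in this example can be checked mechanically. The following Python sketch is an illustration only. The `LongLoop` record, its field names, and the function `self_limiting` are hypothetical, and the sketch assumes that identifiers are assigned in order along a chromosome, so that a short loop whose identifier falls between the T1 and T2 identifiers lies inside the long loop. That ordering is an assumption, not something stated in this section.

```python
from dataclasses import dataclass, field


@dataclass
class LongLoop:
    """A T1-T2 long loop and the C1/C2 short loops that control it (hypothetical record)."""
    chromosome: int
    t1_id: int
    t2_id: int
    # Each controller is (chromosome, identifier) of a C1/C2 short loop.
    controllers: list = field(default_factory=list)


def self_limiting(loop):
    """Return identifiers of controlling short loops that lie inside the loop.

    Assumption: identifiers increase along a chromosome, so t1_id < id < t2_id
    on the same chromosome means the short loop sits within the long loop.
    """
    return sorted(
        ident
        for chrom, ident in loop.controllers
        if chrom == loop.chromosome and loop.t1_id < ident < loop.t2_id
    )


# Identifiers taken from the M. jannaschii example.
mj_loop = LongLoop(chromosome=1, t1_id=1447, t2_id=1471,
                   controllers=[(1, 1448), (1, 1470)])
print(self_limiting(mj_loop))
```

A long loop whose controlling short loops all lie on other chromosomes yields an empty list under this check.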



A double stranded DNA loop of length 22.675 kilo-bases on chromosome 1 is 
bounded on the left by a T1 sequence whose identifier is 1447. This T1 control
element has the DNA sequence 

TTATAGAACATTATGAAGCTTTTTACTCAACTAACAACCGTATCGAATTTA 
CCATTACTTGGAAATCTATTTAAAACCTCTTTAATCTTATGATA 

This double stranded DNA loop is bounded on the right by a T2 control element
whose identifier is 1471. This T2 control element has the DNA sequence

CAACTAACAACCGTATCGAATTTACCATTACTTGGAAATCTATTTAAAACC
TCTTTAATCTTGTGATAATAAATTCTAATCGATTCGTGACTTAT



This long T1/T2 double stranded DNA loop modulates the expression of the
following genes

MJ1402 MJ1403 MJ1404 MJ1405 MJ1406 MJ1407 MJ1408
MJ1409 MJ1410 MJ1411 MJ1412 MJ1413 MJ1414 MJ1415
MJ1416 MJ1417 MJ1418 MJ1419 MJ1420



This long T1/T2 double stranded DNA loop modulates the expression of the 
following C1/C2 short loops 

A C1/C2 short loop on chromosome 1 whose identifier is 1448 controls the
expression of the genes of one or more other T1/T2 long loops. This C1/C2 short
loop is expressed as an RNA single strand that is 3'UTR to the gene MJ1401 and has
the DNA sequence

TTATAGAACATTATGAAGCTTTTTACTCAACTAACAACCGTATCGAATTTA
CCATTACTTGGAAATCTATTTAAAACCTCTTTAATCTTATGATAATAAATT
CTAATCGATTCGTGACTTAT



A C1/C2 short loop on chromosome 1 whose identifier is 1470 controls the
expression of the genes of one or more other T1/T2 long loops. This C1/C2 short
loop is expressed as an RNA single strand that is 3'UTR to the gene MJ1420 and has
the DNA sequence

TTATAGAACATTATGAAGCTTTTTACTCAACTAACAACCGTATCGAATTTA
CCATTACTTGGAAATCTATTTAAAACCTCTTTAATCTTGTGATAATAAATT
CTAATCGATTCGTG

The expression of genes in this T1/T2 long loop is controlled by the following C1/C2
short loops.

A C1/C2 short loop on chromosome 1 whose identifier is 1470 controls the
expression of the genes in this T1/T2 long loop. This C1/C2 short loop is expressed as
an RNA single strand that is 3'UTR to the gene MJ1420 and has the DNA sequence

TTATAGAACATTATGAAGCTTTTTACTCAACTAACAACCGTATCGAATTTA
CCATTACTTGGAAATCTATTTAAAACCTCTTTAATCTTGTGATAATAAATT
CTAATCGATTCGTG

The match between the T1 sequence and the C1/C2 sequence is

TTATAGAACATTATGAAGCTTTTTACTCAACTAACAACCGTATCGAATTTA 
CCATTACTTGGAAATCTATTTAAAACCTCTTTAATCTT 

The match between the T2 sequence and the C1/C2 sequence is 

CAACTAACAACCGTATCGAATTTACCATTACTTGGAAATCTATTTAAAACC 
TCTTTAATCTTGTGATAATAAATTCTAATCGATTCGTG 

A C1/C2 short loop on chromosome 1 whose identifier is 1448 controls the
expression of the genes in this T1/T2 long loop. This C1/C2 short loop is expressed
as an RNA single strand that is 3'UTR to the gene MJ1401 and has the DNA sequence

TTATAGAACATTATGAAGCTTTTTACTCAACTAACAACCGTATCGAATTTA
CCATTACTTGGAAATCTATTTAAAACCTCTTTAATCTTATGATAATAAATT
CTAATCGATTCGTGACTTAT

The match between the T1 sequence and the C1/C2 sequence is

TTATAGAACATTATGAAGCTTTTTACTCAACTAACAACCGTATCGAATTTA
CCATTACTTGGAAATCTATTTAAAACCTCTTTAATCTTATGATA

The match between the T2 sequence and the C1/C2 sequence is 

CAACTAACAACCGTATCGAATTTACCATTACTTGGAAATCTATTTAAAACC 
TCTTTAATCTT 



Example of a single-celled self-limiting connectron - S. cerevisiae

In this example the existence of the T1-T2 (293-320) long loop is controlled by
the C1/C2 (298) short loop. The expression of this C1/C2 short loop is controlled by the
existence of the T1-T2 (293-320) long loop. The existence of this T1-T2 long loop is
itself determined by the expression of the C1/C2 (298) short loop. The C1/C2 (298)
short loop is the self-limiting connectron.



298           Chromosome 2

|          Chromosome 2          |
293                            320
|   298                          |



A double stranded DNA loop of length 38.470 kilo-bases on chromosome 2 is
bounded on the left by a T1 sequence whose identifier is 293. This T1 control
element has the DNA sequence

GAATTGTTGGAATAAAAATCCACTATCGTCTATCAACTAATAGTTATATTA
TCAATATATTATCATATACGGTGTTAAGATGATGACATAAGTTATGAGAA
GCTGTCATCGAAGTTAGAGGAAGCTGAAGTGCAAGGATTGATAATGTAAT
AGGATAATGAAACATATAAAACGGAATGAGGAATAATCGTAATATTAGT
ATGTAGAAATATAGATTCCATTTTGAGGATTCCTATATCCTTGAGGAGAAC
TTCTAGT

This double stranded DNA loop is bounded on the right by a T2 control element 
whose identifier is 320. This T2 control element has the DNA sequence

AATATTAGTATGTAGAAATATAGATTCCATTTTGAGGATTCCTATATCCTC 
GAGGAGAACTTCTAGTATATTCTGTA 

This long T1/T2 double stranded DNA loop modulates the expression of the 
following genes 

YBL005W-B TS(AGA)B YBL004W YBL003C YBL002W YBL001C
YBR001C YBR002C YBR003W YBR004C YBR005W YBR006W
YBR007C YBR008C YBR009C YBR010W YBR011C YBR012C

This long T1/T2 double stranded DNA loop modulates the expression of the 
following C1/C2 short loops 

A C1/C2 short loop on chromosome 2 whose identifier is 298 controls the expression 
of the genes of one or more other T1/T2 long loops. This C1/C2 short loop is 
expressed as a RNA single strand that is 3'UTR to the gene YBL005W-B and has the 
DNA sequence 

ATCTATTACATTATGGGTGGTATGTTGGAATAAAAATCCACTATCGTCTAT 

CAACTAATAGTTATATTATCAATATATTATCATATACGGTGTTAAGATGAT 

GACATAAGTTATGAGAAGCTGTCATCGAAGTTAGAGGAAGCTGAAGTGCA 

AGGATTGATAATGTAATAGGATAATGAAACATATAAAACGGAATGAGGA 

ATAATCGTAATATTAGTATGTAGAAATATAGATTCCATTTTGAGGATTCCT 

ATATCCTTGAGGAGAACTTCTAGTATATTCTGTATACCTAATATTATAGCC 

TTTATCAACAATGGAATCCCAACAATTATCTCAACATTC 

The expression of genes in this T1/T2 long loop is controlled by the following C1/C2
short loops.

A C1/C2 short loop on chromosome 2 whose identifier is 298 controls the expression
of the genes in this T1/T2 long loop. This C1/C2 short loop is expressed as an RNA
single strand that is 3'UTR to the gene YBL005W-B and has the DNA sequence

ATCTATTACATTATGGGTGGTATGTTGGAATAAAAATCCACTATCGTCTAT 

CAACTAATAGTTATATTATCAATATATTATCATATACGGTGTTAAGATGAT 

GACATAAGTTATGAGAAGCTGTCATCGAAGTTAGAGGAAGCTGAAGTGCA 

AGGATTGATAATGTAATAGGATAATGAAACATATAAAACGGAATGAGGA 

ATAATCGTAATATTAGTATGTAGAAATATAGATTCCATTTTGAGGATTCCT 

ATATCCTTGAGGAGAACTTCTAGTATATTCTGTATACCTAATATTATAGCC 

TTTATCAACAATGGAATCCCAACAATTATCTCAACATTC 

The match between the T1 sequence and the C1/C2 sequence is

TGTTGGAATAAAAATCCACTATCGTCTATCAACTAATAGTTATATTATCAA 

TATATTATCATATACGGTGTTAAGATGATGACATAAGTTATGAGAAGCTG 

TCATCGAAGTTAGAGGAAGCTGAAGTGCAAGGATTGATAATGTAATAGGA 

TAATGAAACATATAAAACGGAATGAGGAATAATCGTAATATTAGTATGTA 

GAAATATAGATTCCATTTTGAGGATTCCTATATCCTTGAGGAGAACTTCTA 

GT 

The match between the T2 sequence and the C1/C2 sequence is 
AATATTAGTATGTAGAAATATAGATTCCATTTTGAGGATTCCTATATCCT 



Example of a multi-celled self-limiting connectron - C. elegans 

In this example the existence of the T1-T2 (17154-17190) long loop is controlled by
the C1/C2 (17155) short loop. The expression of this C1/C2 short loop is controlled by the
existence of the T1-T2 (17154-17190) long loop. The existence of this T1-T2 long loop is
itself determined by the expression of the C1/C2 (17155) short loop. The C1/C2 (17155)
short loop is the self-limiting connectron.

17155         Chromosome 4

|          Chromosome 4          |
17154                        17190
|   17155                        |



A double stranded DNA loop of length 89.919 kilo-bases on chromosome 4 is 
bounded on the left by a T1 sequence whose identifier is 17154. This T1 control
element has the DNA sequence 

AAATTTCCGGCAAATCGGCAAACTGGCAA 

This double stranded DNA loop is bounded on the right by a T2 control element 
whose identifier is 17190. This T2 control element has the DNA sequence 

AATTTGCCGATTTGCCGAATTTGTCGACA 

This long T1/T2 double stranded DNA loop modulates the expression of the 
following genes 

R08C7.11 M01H9.2 M01H9.3 M01H9.4 M01H9.1 ZK180.1 ZK180.2
ZK180.3 ZK180.4 ZK180.5 ZK180.6 ZK185.3 ZK185.2

This long T1/T2 double stranded DNA loop modulates the expression of the
following C1/C2 short loops

A C1/C2 short loop on chromosome 4 whose identifier is 17155 controls the 
expression of the genes of one or more other T1/T2 long loops. This C1/C2 short 
loop is expressed as a RNA single strand that is 3'UTR to the gene R08C7.1 and has 
the DNA sequence 

AAATTTCCGGCAAATCGGCAAACTGGCAATTTGCCGATTTGCCGAATTTGT 
CGACA 

A C1/C2 short loop on chromosome 4 whose identifier is 17171 controls the 
expression of the genes of one or more other T1/T2 long loops. This C1/C2 short 
loop is expressed as a RNA single strand that is 3'UTR to the gene ZK180.2 and has 
the DNA sequence 

TGGAAATTTCAGAATTTCAATTTTAATCGGCAAAATTGTACGCATCCTATG 
AATTT 

The expression of genes in this T1/T2 long loop is controlled by the following C1/C2 
short loops. 

A C1/C2 short loop on chromosome 4 whose identifier is 17155 controls the 
expression of the genes in this T1/T2 long loop. This C1/C2 short loop is expressed 
as an RNA single strand that is 3'UTR to the gene R08C7.1 and has the DNA sequence

AAATTTCCGGCAAATCGGCAAACTGGCAATTTGCCGATTTGCCGAATTTGT 
CGACA 

The match between the T1 sequence and the C1/C2 sequence is

AAATTTCCGGCAAATCGGCAAACTGGCAA

The match between the T2 sequence and the C1/C2 sequence is

AATTTGCCGATTTGCCGAATTTGTCGACA
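
The "match" lines in these examples report the stretch of sequence that a T1 or T2 control element shares with a C1/C2 short loop. A minimal sketch of one way to recover such a match is given below, using a plain longest-common-substring search on the C. elegans sequences listed above. The function name `longest_common_substring` and the dynamic-programming approach are illustrative assumptions; this section does not state how the application's own algorithm performs the comparison.

```python
def longest_common_substring(a, b):
    """Return (start_in_a, start_in_b, length) of the longest run shared by a and b."""
    best = (0, 0, 0)
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                # Extend the run that ended at the previous pair of positions.
                curr[j] = prev[j - 1] + 1
                if curr[j] > best[2]:
                    best = (i - curr[j], j - curr[j], curr[j])
        prev = curr
    return best


# Sequences copied from the C. elegans self-limiting example (identifiers
# 17154, 17190 and 17155).
T1 = "AAATTTCCGGCAAATCGGCAAACTGGCAA"
T2 = "AATTTGCCGATTTGCCGAATTTGTCGACA"
C1C2 = "AAATTTCCGGCAAATCGGCAAACTGGCAATTTGCCGATTTGCCGAATTTGTCGACA"

for name, target in (("T1", T1), ("T2", T2)):
    t_start, c_start, length = longest_common_substring(target, C1C2)
    print(name, C1C2[c_start:c_start + length], "at C1/C2 offset", c_start)
```

In this example the T1 match sits at the start of the C1/C2 sequence and the T2 match at its end, which is consistent with the description of C1 and C2 as adjacent sequences.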

9. Geneless connectrons exist in single-celled and multi-celled eukaryotes



Normally T1-T2 long loops contain genes whose expression is regulated by the 
existence of the long loop. When a T1-T2 long loop does not contain any genes it is 
described as being "geneless". The existence of the T1-T2 long loop is itself 
controlled by one or more C1/C2 short loops that may be on the same or different 
chromosomes. The geneless T1-T2 long loops must contain one or more C1/C2 short 
loops. 
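
The definition above amounts to a simple test on the contents of a long loop. The sketch below is illustrative only: the record `LoopContents` and the function `is_geneless` are hypothetical names, and the data are the identifiers of the S. cerevisiae example in this section (long loop 1537-1559, which contains short loops 1538 through 1558 and no genes).

```python
from dataclasses import dataclass, field


@dataclass
class LoopContents:
    """What lies between the T1 and T2 elements of a long loop (hypothetical record)."""
    t1_id: int
    t2_id: int
    genes: list = field(default_factory=list)
    inner_short_loops: list = field(default_factory=list)


def is_geneless(loop):
    """A geneless loop contains no genes but at least one C1/C2 short loop."""
    return not loop.genes and len(loop.inner_short_loops) >= 1


yeast_loop = LoopContents(1537, 1559, genes=[],
                          inner_short_loops=list(range(1538, 1559)))
print(is_geneless(yeast_loop), len(yeast_loop.inner_short_loops))
```

A loop that contains genes, such as the M. jannaschii loop 1447-1471 with genes MJ1402 through MJ1420, fails the test even though it also contains short loops.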

Example of a single-celled geneless connectron - S. cerevisiae

In this example the existence of the T1-T2 (1537-1559) long loop is controlled by
three C1/C2 (3789, 5289 and 5753) short loops. The expression of 21 C1/C2 (1538
through 1558) short loops is controlled by the existence of the T1-T2 (1537-1559)
long loop.

3789          Chromosome 9
5289          Chromosome 12
5753          Chromosome 13

|          Chromosome 4          |
1537                          1559
|      1538 through 1558        |



A double stranded DNA loop of length 4.825 kilo-bases on chromosome 4 is bounded 
on the left by a T1 sequence whose identifier is 1537. This T1 control element has
the DNA sequence 

ATGAGATATATGTGGGTAATTAGATAATTGTTGGGATTCCATTGTTGATAA 
AGGCTATAATATTAGGTATACAGAATATACTAGAAGTTCTCCTCGAGGAT 
TTAGGAATCCATAAAAGGGAATCTGCAATTCTACACAATTCTATAAATAT 
TATTATCATCGTTTTATATGTTAATATTCATTGATCCTATTACATTATCAAT 
CCTTGCGTTTCAGCTTCCACTAATTTAGATGACTATTTCTCATCATTTGCGT
CATCTTCTAACACCGTATATGATAATATACTAGTAACGTAAATACTAGTTA
GTAGATGATAGTTGATTTTTATTCCAACATACCACCCATAATGTAATAGAT
CTAAT


This double stranded DNA loop is bounded on the right by a T2 control element 
whose identifier is 1559. This T2 control element has the DNA sequence 

ATGAGATATATGTGGGTAATTAGATAATTGTTGGGATTCCATTGTTGATAA
AGGCTATAATATTAGGTATACAGAATATACTAGAAGTTCTCCTCGAGGAT
TTAGGAATCCATAAAAGGGAATCTGCAATTCTACACAATTCTATAAATAT
TATTATCATCGTTTTATATGTTAATATTCATTGATCCTATTACATTATCAAT
CCTTGCGTTTCAGCTTCCACTAATTTAGATGACTATTTCTCATCATTTGCGT
CATCTTCTAACACCGTATATGATAATATACTAGTAACGTAAATACTAGTTA
GTAGATGATAGTTGATTTTTATTCCAACATACCACCCATAATGTAATAGAT

There are no genes controlled by this T1/T2 loop.

This long T1/T2 double stranded DNA loop modulates the expression of the
following C1/C2 short loops

A C1/C2 short loop on chromosome 4 whose identifier is 1538 controls the
expression of the genes of one or more other T1/T2 long loops. This C1/C2 short
loop has the DNA sequence

ATGAGATATATGTGGGTAATTAGATAATTGTTGGGATTCCATTGTTGATAA
AGGCTATAATATTAGGTATACAGAATATACTAGAAGTTCTCCTCGAGGAT
TTAGGAATCCATAAAAGGGAATCTGCAATTCTACACAATTCTATAAATAT
TATTATCATCGTTTTATATGTTAATATTCATTGATCCTATTACATTATCAAT
CCTTGCGTTTCAGCTTCCACTAATTTAGATGACTATTTCTCATCATTTGCGT
CATCTTCTAACACCGTATATGATAATATACTAGTAACGTAAATACTAGTTA
GTAGATGATAGTTGATTTTTATTCCAACATACCACCCATAATGTAATAGAT
CTAATGAATCCATTTGTTTGTTAATAGTTT

This T1-T2 loop also modulates the C1/C2 short loops numbered 1539 to 1557 

A C1/C2 short loop on chromosome 4 whose identifier is 1558 controls the 
expression of the genes of one or more other T1/T2 long loops. This C1/C2 short 
loop has the DNA sequence 

AGCTTCTCATAACTTATGTCATCATCTTAACACCGTATATGATAATATATT 

GATAATATAACTTGTTGGAATAAAAATCAACTATCATCTACTAACTAGTAT 

TTACGTTACTAGTATATTATCATATACGGTGTTAGAAGATGACGCAAATG 

ATGAGAAATAGTCATCTAAATTAGTGGAAGCTGA...GTCTATCTGGCGAAT 

ATAAATTTTTACGCTACACACGTCATCGACATCTAAATATGACAGTCGCTG 

AACTGTTCTTAGATATCCATGCTATTTATGAAGAACAACAGGGATCGAGA 

AACAG 

The expression of genes in this T1/T2 long loop is controlled by the following C1/C2 
short loops. 

A C1/C2 short loop on chromosome 9 whose identifier is 3789 controls the 
expression of the genes in this T1/T2 long loop. This C1/C2 short loop is expressed 
as a RNA single strand that is 3'UTR to the gene YIL059C and has the DNA 
sequence 

TTTATATGTTAATATTCATTGATCCTATTACATTATCAATCCTTGCGTTTCA 
GCTTCCACTAATTTAGATGACTATTTCTCATCATTTGCGTCATCTTCTAACA 
CCGTATATGATAATATACTAGTAACGTAAATACTAGTTAGTAGATGATAG 
TTGATTTTTATTCCAACAGTAT 

The match between the T1 sequence and the C1/C2 sequence is

TTTATATGTTAATATTCATTGATCCTATTACATTATCAATCCTTGCGTTTCA
GCTTCCACTAATTTAGATGACTATTTCTCATCATTTGCGTCATCTTCTAACA
CCGTATATGATAATATACTAGTAACGTAAATACTAGTTAGTAGATGATAG
TTGATTTTTATTCCAACA

The match between the T2 sequence and the C1/C2 sequence is

TTTATATGTTAATATTCATTGATCCTATTACATTATCAATCCTTGCGTTTCA
GCTTCCACTAATTTAGATGACTATTTCTCATCATTTGCGTCATCTTCTAACA
CCGTATATGATAATATACTAGTAACGTAAATACTAGTTAGTAGATGATAG
TTGATTTTTATTCCAACA

A C1/C2 short loop on chromosome 12 whose identifier is 5289 controls the
expression of the genes in this T1/T2 long loop. This C1/C2 short loop is expressed
as an RNA single strand that is 3'UTR to the gene YLR301W and has the DNA
sequence

GGTGAATTTTGAGATAATTGTTGGGATTCCATTTTTAATAAGGCAATAATA
TTAGGTATGTAGAATATACTAGAAGTTCTCCTCGAGGATTTAGGAATCCAT
AAAAGGGAATCTGCAATTCTACACAATTCTATAAATATTATTATCATCGTT
TTATATGTTAATATTCATTGATCCTATTACATTATCAATCCTTGCGTTTCAG
CTTCCACTAATTTAGATGACTATTTCTCATCATTTGCGTCATCTTCTAACAC
CGTATATGATAATATACTAGTAACGTAAATACTAGTTAGTAGATGATAGT
TGATTTTTATTCCAACAC

The match between the T1 sequence and the C1/C2 sequence is

AGAATATACTAGAAGTTCTCCTCGAGGATTTAGGAATCCATAAAAGGGAA
TCTGCAATTCTACACAATTCTATAAATATTATTATCATCGTTTTATATGTTA
ATATTCATTGATCCTATTACATTATCAATCCTTGCGTTTCAGCTTCCACTAA
TTTAGATGACTATTTCTCATCATTTGCGTCATCTTCTAACACCGTATATGAT
AATATACTAGTAACGTAAATACTAGTTAGTAGATGATAGTTGATTTTTATT
CCAACA

The match between the T2 sequence and the C1/C2 sequence is

AGGATTTAGGAATCCATAAAAGGGAATCTGCAATTCTACACAATTCTATA
AATATTATTATCATCGTTTTATATGTTAATATTCATTGATCCTATTACATTA
TCAATCCTTGCGTTTCAGCTTCCACTAATTTAGATGACTATTTCTCATCATT
TGCGTCATCTTCTAACACCGTATATGATAATATACTAGTAACGTAAATACT
AGTTAGTAGATGATAGTTGATTTTTATTCCAACA

A C1/C2 short loop on chromosome 13 whose identifier is 5753 controls the
expression of the genes in this T1/T2 long loop. This C1/C2 short loop is expressed
as an RNA single strand that is 3'UTR to the gene YMR044W and has the DNA
sequence

TTGAGAAATGGGGGAATGTTGAGATAATTGTTGGGATTCCATTGTTGATA
AAGGCTATAATATTAGGTATACAGAATATACTAGAAGTTCTCCTCAAGGA
TATAGGAATCCTCAAAATGGAATCTATATTTCTACATACTAATATTACGAT
TATTCCTCATTCCGTTTTATATGTTTCATTATCCTATTACATTATCAATCCT
TGCACTTCAGCTTCCTCTAACTTCGATGACAGCTTCTCATAACTTATGTCA
TCATCTTAACACCGTATATGATAATATATTGATAATATAACTATTAGTTGA
TAGACGATAGTGGATTTTTATTCCAACAT

The match between the T1 sequence and the C1/C2 sequence is

AGATAATTGTTGGGATTCCATTGTTGATAAAGGCTATAATATTAGGTATAC
AGAATATACTAGAAGTTCTCCTC

The match between the T2 sequence and the C1/C2 sequence is

TTGTTGGGATTCCATTGTTGATAAAGGCTATAATATTAGGTATACAGAATA
TACTAGAAGTTCTCCTCAAGGAT


Two examples of multi-celled geneless connectrons - C. elegans

In the first example the existence of the T1-T2 (2342-2344) long loop is controlled
by the C1/C2 (24114) short loop. The expression of one C1/C2 (2343) short loop is
controlled by the existence of the T1-T2 (2342-2344) long loop.



24114         Chromosome 5

|          Chromosome 1          |
2342                          2344
|             2343              |



In the second example the existence of the T1-T2 (29221-29262) long loop is
controlled by the C1/C2 (4291) short loop. The expression of 40 C1/C2 (29222
through 29261) short loops is controlled by the existence of the T1-T2
(29221-29262) long loop.

4291          Chromosome 1

|          Chromosome 5          |
29221                        29262
|     29222 through 29261       |



A double stranded DNA loop of length 67.059 kilo-bases on chromosome 1 is
bounded on the left by a T1 sequence whose identifier is 2342. This T1 control
element has the DNA sequence

TGAAAACTACAGTAATTCTTTAAATGACTACTGTAGC



This double stranded DNA loop is bounded on the right by a T2 control element 
whose identifier is 2344. This T2 control element has the DNA sequence 

CTACTGTAGCGCTTGTGTCGATTTACGGGCTCGATTT 



There are no genes controlled by this T1/T2 loop. 

This long T1/T2 double stranded DNA loop modulates the expression of the
following C1/C2 short loops

A C1/C2 short loop on chromosome 1 whose identifier is 2343 controls the
expression of the genes of one or more other T1/T2 long loops. This C1/C2 short
loop has the DNA sequence

TCGACACAAGCGCTACAGTAGCTATTTAAAGAATTACTGTAGTTTTCGCTA
CGAGATATTT

The expression of genes in this T1/T2 long loop is controlled by the following C1/C2
short loops.

A C1/C2 short loop on chromosome 5 whose identifier is 24114 controls the
expression of the genes in this T1/T2 long loop. This C1/C2 short loop is expressed
as an RNA single strand that is 3'UTR to the gene C13F10.5 and has the DNA
sequence

GCGAAAACTACAGTAATTCTTTAAATGACTACTGTAGCGCTTGTGTCGATT
TACGGGCTCGATTTTCG

The match between the T1 sequence and the C1/C2 sequence is

GAAAACTACAGTAATTCTTTAAATGACTACTGTAGC

The match between the T2 sequence and the C1/C2 sequence is

CTACTGTAGCGCTTGTGTCGATTTACGGGCTCGATTT



A double stranded DNA loop of length 41.297 kilo-bases on chromosome 5 is 
bounded on the left by a T1 sequence whose identifier is 29221. This T1 control
element has the DNA sequence 

TTTAAATTTCCCGCCAAAAATTGACTGAAAATTTGGATTTTCTTTCCAAAA 
ATTGACAGAAA 

This double stranded DNA loop is bounded on the right by a T2 control element 
whose identifier is 29262. This T2 control element has the DNA sequence

TGAAAATTTGAATTTCCCGCCAAAAATTAAC

There are no genes controlled by this T1/T2 loop.

This long T1/T2 double stranded DNA loop modulates the expression of the 
following C1/C2 short loops 

A C1/C2 short loop on chromosome 5 whose identifier is 29222 controls the 
expression of the genes of one or more other T1/T2 long loops. This C1/C2 short
loop has the DNA sequence 

AATTTCCCGCCAAAAATTGACTGAAAATTTGGATTTTCTTTCCAAAAATTG 
ACAGAAA 

This T1-T2 loop also modulates the C1/C2 short loops numbered 29223 to 29260

A C1/C2 short loop on chromosome 5 whose identifier is 29261 controls the
expression of the genes of one or more other T1/T2 long loops. This C1/C2 short
loop has the DNA sequence

AAAATTGACTGAAAATTTGAATTTCCAGCCAAAAATTGACTGAAAATTTG
AATT


The expression of genes in this T1/T2 long loop is controlled by the following C1/C2 
short loops. 

A C1/C2 short loop on chromosome 1 whose identifier is 4291 controls the
expression of the genes in this T1/T2 long loop. This C1/C2 short loop is expressed
as an RNA single strand that is 3'UTR to the gene Y43F8C.5 and has the DNA
sequence

AAAATTAACTGAAAATTTGAATTTCCCGCCAAAAATTGACTGAAAATTTG
AATTTCCCGCCAAAAAAAATTGACTGAAAATTTGAATTTCCCGCCAAAAA
TTGACTGAAAATTTGAATTTCCCGCCAAAAATTAATTGAAt^ATTTG/^
cccgccaaaaattaattgaaactttgaattttcaa...atttcccgccaaaa
attaattgaaactttgaattttcaaatttcccgccaaaaattgactgaaaa
tttgaatttcccgccaaaaattaattgaaaatttgaatttttgaatttc^
gccaaaaatgactga

The match between the T1 sequence and the C1/C2 sequence is

TTTAAATTTCCCGCCAAAAATTGACTGAAAATTTG

The match between the T2 sequence and the C1/C2 sequence is

AAAAAAATTGACTGAAAATTTGAATTTCCCGCCAAAAATTGA

10. One connectron controls many geneless connectrons in single-celled and
multi-celled eukaryotes

One C1/C2 short loop can control the existence of many geneless T1-T2 long loops.

Example of a single-celled geneless connectron - S. cerevisiae

In this example the existence of the three T1-T2 (1142-1156, 1243-1272 and
7102-7117) long loops is controlled by the C1/C2 (5289) short loop.
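
The one-to-many relationship described here is the inverse of the per-loop listings ("the expression of genes in this T1/T2 long loop is controlled by the following C1/C2 short loops"). The sketch below shows one way to invert those listings so that every long loop controlled by a given short loop can be looked up directly. The dictionary layout and the function name `loops_by_controller` are hypothetical; the identifiers are those given for S. cerevisiae in this section and the preceding one, with chromosomes as stated in the loop descriptions.

```python
from collections import defaultdict

# (chromosome, T1 identifier, T2 identifier) -> identifiers of the C1/C2
# short loops listed as controlling that long loop.
controllers = {
    (4, 1142, 1156): [5289],
    (4, 1243, 1272): [5289],
    (15, 7102, 7117): [5289],
    (4, 1537, 1559): [3789, 5289, 5753],
}


def loops_by_controller(controllers):
    """Invert the mapping: short loop identifier -> list of long loops it controls."""
    index = defaultdict(list)
    for loop, short_ids in controllers.items():
        for short_id in short_ids:
            index[short_id].append(loop)
    return dict(index)


index = loops_by_controller(controllers)
for short_id in sorted(index):
    print(short_id, len(index[short_id]), index[short_id])
```

With these data, short loop 5289 appears as a controller of every listed long loop, while 3789 and 5753 each control only the 1537-1559 loop.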

5289          Chromosome 12

|          Chromosome 4          |
1142                          1156
|      1143 through 1155        |

5289          Chromosome 12

|          Chromosome 4          |
1243                          1272
|      1244 through 1271        |

5289          Chromosome 12

|          Chromosome 15         |
7102                          7117
|      7103 through 7116        |



A double stranded DNA loop of length 5.337 kilo-bases on chromosome 4 is bounded
on the left by a T1 sequence whose identifier is 1142. This T1 control element has
the DNA sequence

ATTTTGAGATAATTGTTGGGATTCCATTTTTAATAAGGCAATAATATTAGG
TATGTAGATATACTAGAAGTTCTCCTCGAGGATTTAGGAATCCATAAAAG
GGAATCTGCAATTCTACACAATTCTATAAATATTATTATCATCATTTTATA
TGTTAATATTCATTGATCCTATTACATTATCAATCCTTGCGTTTCAGCTTCC
ACTAATTTAGATGACTATTTCTCATCATTTGCGTCATCTTCTAACACCGTAT
ATGATAATATACTAGTAACGTAAATACTAGTTAGTAGATGATAGTTGATTT
TTATTCCAACA

This double stranded DNA loop is bounded on the right by a T2 control element 
whose identifier is 1156. This T2 control element has the DNA sequence

TTTTAATAAGGCAATAATATTAGGTATGTAGATATACTAGAAGTTCTCCTC 

CAGGATTTAGGAATCCATAAAAGGGAATCTGCAATTCTACACAATTCTAT 

AAATATTATTATCATCATTTTATATGTTAATATTCATTGATCCTATTACATT 

ATCAATCCTTGCGTTTCAGCTTCCACTAATTTAGATGACTATTTCTCATCAT 

TTGCGTCATCTTCTAACACCGTATATGATAATATACTAGTAACGTAAATAC 

TAGTTAGTAGATGATAGTTGATTTTTATTCCAACAAGAA 

There are no genes controlled by this T1/T2 loop. 

This long T1/T2 double stranded DNA loop modulates the expression of the 
following C1/C2 short loops 

A C1/C2 short loop on chromosome 4 whose identifier is 1143 controls the 
expression of the genes of one or more other T1/T2 long loops. This C1/C2 short 
loop has the DNA sequence 

ATTTTGAGATAATTGTTGGGATTCCATTTTTAATAAGGCAATAATATTAGG 

TATGTAGATATACTAGAAGTTCTCCTCGAGGATTTAGGAATCCATAAAAG 

GGAATCTGCAATTCTACACAATTCTATAAATATTATTATCATCATTTTATA 

TGTTAATATTCATTGATCCTATTACATTATCAAT...CTCTAAGTCTCATTGCC

TTTGTGCCAAAAAATCTGTTTCTAAATTTCTCTTCATTTGTAGACTTAATTA 

TACTGATCGTTGATCTACTATCAGTAAGTAAGCCTTTAATAATTGGTTTCT 

TGTTAAGTTCTTGCACAAGGTGACTGAGGTTATTCAATAGCGG 

This T1-T2 loop also modulates the C1/C2 short loops numbered 1144 to 1154

A C1/C2 short loop on chromosome 4 whose identifier is 1155 controls the 
expression of the genes of one or more other T1/T2 long loops. This C1/C2 short 
loop has the DNA sequence 

GAGGAGAACTTCTAGTATATCTACATACCTAATATTATTGCCTTATTAAAA 
ATGGAATCCCAACAATTA 

The expression of genes in this T1/T2 long loop is controlled by the following C1/C2 
short loops. 

A C1/C2 short loop on chromosome 12 whose identifier is 5289 controls the 
expression of the genes in this T1/T2 long loop. This C1/C2 short loop is expressed 
as a RNA single strand that is 3'UTR to the gene YLR301W and has the DNA 
sequence 

GGTGAATTTTGAGATAATTGTTGGGATTCCATTTTTAATAAGGCAATAATA 

TTAGGTATGTAGAATATACTAGAAGTTCTCCTCGAGGATTTAGGAATCCAT 

AAAAGGGAATCTGCAATTCTACACAATTCTATAAATATTATTATCATCGTT 

TTATATGTTAATATTCATTGATCCTATTACATTATCAATCCTTGCGTTTCAG 

CTTCCACTAATTTAGATGACTATTTCTCATCATTTGCGTCATCTTCTAACAC 

CGTATATGATAATATACTAGTACGTAAATACTAGTTAGTAGATGATAGTT 

GATTTTTATTCCAACAC 

The match between the T1 sequence and the C1/C2 sequence is

ATTTTGAGATAATTGTTGGGATTCCATTTTTAATAAGGCAATAATATTAGG
TATGTAGA

The match between the T2 sequence and the C1/C2 sequence is

TTTTAATAAGGCAATAATATTAGGTATGTAGA



A double stranded DNA loop of length 5.251 kilo-bases on chromosome 4 is bounded 
on the left by a T1 sequence whose identifier is 1243. This T1 control element has
the DNA sequence 

CGTGTTTTATCTCATGTTGTTCGTTTTGTTATTGAGATATATGTGGGTAATT 

AGATAATTGTTGGGATTCCATTGTTGATAAAGGCTATAATATTAGGTATAC 

AGAATATACTAGAAGTTCTCCTCGAGGATTTAGGAATCCATAAAAGGGAA 

TCTGCAATTCTACACAATTCTATAAATATTATTATCATCGTTTTATATGTTA 

ATATTCATTGATCCTATTACATTATCAATCCTTGCGTTTCAGCTTCCACTAA 

TTTAGATGACTATTTCTCATCATTTGCGTCATCTTCTAACACCGTATATGAT 

AATATACTAGTAACGTAAATACTAGTTAGTAGATGATAGTTGATTTTTATT 

CCAACA 

This double stranded DNA loop is bounded on the right by a T2 control element 
whose identifier is 1272. This T2 control element has the DNA sequence 

TGAGATATATGTGGGTAATTAGATAATTGTTGGGATTCCATTGTTGATAAA 

GGCTATAATATTAGGTATACAGAATATACTAGAAGTTCTCCTCGAGGATTT 

AGGAATCCATAAAAGGGAATCTGCAATTCTACACAATTCTATAAATATTA 

TTATCATCGTTTTATATGTTAATATTCATTGATC...TATACTAGTAACGTAA 

ATACTAGTTAGTAGATGATAGTTGATTTTTATTCCAACAGTTATAAGGTTG 

TTTCATATGTGTTTTATGAA 

There are no genes controlled by this T1/T2 loop. 

This long T1/T2 double stranded DNA loop modulates the expression of the 
following C1/C2 short loops 

A C1/C2 short loop on chromosome 4 whose identifier is 1244 controls the
expression of the genes of one or more other T1/T2 long loops. This C1/C2 short
loop has the DNA sequence

TTTATCTCATGTTGTTCGTTTTGTTATTGAGATATATGTGGGTAATTAGATA
ATTGTTGGGATTCCATTGTTGATAAAGGCTATAATATTAGGTATACAGAAT
ATACTAGAAGTTCTCCTCGAGGATTTAGGAATCCATAAAAGGGAATCTGC
AATTCTACACAATTCTATAAATATTATTATCAT...GTCTCGATGTAGTATAC
GTATAAATTATTACCTGATACTTCATCTCTAAGTCTCATTGCCTTTGTGCCA
AAAAATCTGTTTCTAAATTTCTCTTCATTTGTAGACTTAATTATACTGATCG
TTGATCTACTATCAGTAAGT

This T1-T2 loop also modulates the C1/C2 short loops numbered 1245 to 1270

A C1/C2 short loop on chromosome 4 whose identifier is 1271 controls the
expression of the genes of one or more other T1/T2 long loops. This C1/C2 short
loop has the DNA sequence

TGTTGTATCTCAAAATGAGATATGTCAGTATGACAATACGTCATCCTAAAC
GTTCATAAAACACATATGAAACAACCTTATAACTGTTGGAATAAAAATCA
ACTATCATCTACTAACTAGTATTTACGTTACTAGTATATTATCATATACGG
TGTTAGAAGATGACGCAAATGATGAGAAATAGTC...CAACAATGGAATCC
CAACAATTATCTAATTACCCACATATATCTCATGGTAGCGCCTGTGCTTCG
GTTACTTCTAAGGAAGTCCACACAAATCAAGATCCGTTAGACGTTTCAGC
TTCCAAAA



The expression of genes in this T1/T2 long loop is controlled by the following C1/C2
short loops.

A C1/C2 short loop on chromosome 12 whose identifier is 5289 controls the
expression of the genes in this T1/T2 long loop. This C1/C2 short loop is expressed
as an RNA single strand that is 3'UTR to the gene YLR301W and has the DNA
sequence

GGTGAATTTTGAGATAATTGTTGGGATTCCATTTTTAATAAGGCAATAATA
TTAGGTATGTAGAATATACTAGAAGTTCTCCTCGAGGATTTAGGAATCCAT
AAAAGGGAATCTGCAATTCTACACAATTCTATAAATATTATTATCATCGTT
TTATATGTTAATATTCATTGATCCTATTACATTATCAATCCTTGCGTTTCAG
CTTCCACTAATTTAGATGACTATTTCTCATCATTTGCGTCATCTTCTAACAC
CGTATATGATAATATACTAGTAACGTAAATACTAGTTAGTAGATGATAGT
TGATTTTTATTCCAACAC

The match between the T1 sequence and the C1/C2 sequence is

AGAATATACTAGAAGTTCTCCTCGAGGATTTAGGAATCCATAAAAGGGAA
TCTGCAATTCTACACAATTCTATAAATATTATTATCATCGTTTTATATGTTA
ATATTCATTGATCCTATTACATTATCAATCCTTGCGTTTCAGCTTCCACTAA
TTTAGATGACTATTTCTCATCATTTGCGTCATCTTCTAACACCGTATATGAT
AATATACTAGTAACGTAAATACTAGTTAGTAGATGATAGTTGATTTTTATT
CCAACA

The match between the T2 sequence and the C1/C2 sequence is

AGAATATACTAGAAGTTCTCCTCGAGGATTTAGGAATCCATAAAAGGGAA
TCTGCAATTCTACACAATTCTATAAATATTATTATCATCGTTTTATATGTTA
ATATTCATTGATCCTATTACATTATCAATCCTTGCGTTTCAGCTTCCACTAA
TTTAGATGACTATTTCTCATCATTTGCGTCATCTTCTAACACCGTATATGAT
AATATACTAGTAACGTAAATACTAGTTAGTAGATGATAGTTGATTTTTATT
CCAACA

A double stranded DNA loop of length 5.296 kilo-bases on chromosome 15 is
bounded on the left by a T1 sequence whose identifier is 7102. This T1 control
element has the DNA sequence

CATGATTAATATGACCAATCGGCGTGTGTTTTTGAAAAGTGGGTGAATTTT
GAGATAATTGTTGGGATTCCATTTTTAATAAGGCAATAATATTAGGTATGT
AGAATGTACTAGAAGTTCTCCTCAAGGATTTAGGAATCCATGAAAGGGAA
TCTGCAATTCTACACAATTCTATAAATATTATTATCATCATTTTATATGTTA
ATATTCATTGATCCTATTACATTATCAATCCTTGCGTTTCAGCTTCCACTAA
TTTAGATGACTATTTCTCATCATTTGCGTCATCTTCTAACACCGTATATGAT
AATATACTAGTAACGTAAATACTAGTTAGTAGATGATAGTTGATTTTTATT
CCAACA

This double stranded DNA loop is bounded on the right by a T2 control element
whose identifier is 7117. This T2 control element has the DNA sequence

TGAAAAGTGGGTGAATTTTGAGATAATTGTTGGGATTCCATTTTTAATAAG
GCAATAATATTAGGTATGTAGAATGTACTAGAAGTTCTCCTCAAGGATTT
AGGAATCCATGAAAGGGAATCTGCAATTCTACACAATTCTATAAATATTA
TTATCATCATTTTATATGTTAATATTCATTGATCCTATTACATTATCAATCC
TTGCGTTTCAGCTTCCACTAATTTAGATGACTATTTCTCATCATTTGCGTCA
TCTTCTAACACCGTATATGATAATATACTAGTAACGTAAATACTAGTTAGT
AGATGATAGTTGATTTTTATTCCAACAGTTTTATATACCTCTCTTATTTAGT
ATAAGAA

There are no genes controlled by this T1/T2 loop. 

This long T1/T2 double stranded DNA loop modulates the expression of the
following C1/C2 short loops

A C1/C2 short loop on chromosome 15 whose identifier is 7103 controls the
expression of the genes of one or more other T1/T2 long loops. This C1/C2 short
loop has the DNA sequence

AAGAACATTGCTGATGTGATGACAAAACCTCTTCCGATAAAAACATTTAA
ACTATTAACTAACAAATGGATTCATTAGATCTATTACATTATGGGTGGTAT
GTTGGAATAAAAATCAACTATCATCTACTAACTAGTATTTACGTTACTAGT
ATATTATCATATACGGTGTTAGAAGATGACGCAAATGATGAGAAATAGTC
ATCTAAATTAGTGGAAGCTGAAACGCAAGGATTGATAATGTAATAGGATC
AATGAATATTAACATATAAAATGATGATAATAATATTTATAGAATTGTGT
AGAATTGCAGATTCCCTTTCATGGATTCCTAAATCCTTGAGGAGAACTTCT
AGTA

This T1-T2 loop also modulates the C1/C2 short loops numbered 7104 to 7115

A C1/C2 short loop on chromosome 15 whose identifier is 7116 controls the
expression of the genes of one or more other T1/T2 long loops. This C1/C2 short
loop has the DNA sequence

CCATTCTGTGGAGGTGGTACTGAAGCAGGTTGAGGAGAGACATGATGATG
GTTCTCTGGAACAGCT

The expression of genes in this T1/T2 long loop is controlled by the following C1/C2
short loops.

A C1/C2 short loop on chromosome 12 whose identifier is 5289 controls the
expression of the genes in this T1/T2 long loop. This C1/C2 short loop is expressed
as an RNA single strand that is 3'UTR to the gene YLR301W and has the DNA
sequence

GGTGAATTTTGAGATAATTGTTGGGATTCCATTTTTAATAAGGCAATAATA
TTAGGTATGTAGAATATACTAGAAGTTCTCCTCGAGGATTTAGGAATCCAT
AAAAGGGAATCTGCAATTCTACACAATTCTATAAATATTATTATCATCGTT
TTATATGTTAATATTCATTGATCCTATTACATTATCAATCCTTGCGTTTCAG
CTTCCACTAATTTAGATGACTATTTCTCATCATTTGCGTCATCTTCTAACAC
CGTATATGATAATATACTAGTAACGTAAATACTAGTTAGTAGATGATAGT
TGATTTTTATTCCAACAC

The match between the T1 sequence and the C1/C2 sequence is

GGTGAATTTTGAGATAATTGTTGGGATTCCATTTTTAATAAGGCAATAATA
TTAGGTATGTAGAAT

The match between the T2 sequence and the C1/C2 sequence is

GGTGAATTTTGAGATAATTGTTGGGATTCCATTTTTAATAAGGCAATAATA
TTAGGTATGTAGAAT


Example of a multi-celled geneless connectron - C. elegans 

In this example the existence of the three T1-T2 (3101-3120, 14840-15042 and
15365-15627) long loops is controlled by the C1/C2 (16760) short loop.

[Figure: the C1/C2 short loop 16760 (Chromosome 4) shown with each of the three
T1-T2 long loops: 1142-1156 (labelled Chromosome 4; short loops 3103 through
3119), 14840-15042 (labelled Chromosome 4; short loops 14841 through 15041) and
15365-15627 (labelled Chromosome 5; short loops 15366 through 15625).]

A double stranded DNA loop of length 15.894 kilo-bases on chromosome 1 is
bounded on the left by a T1 sequence whose identifier is 3101. This T1 control
element has the DNA sequence

CAAATCGGCAAATTGCCGGAATTGAACATTTCC

This double stranded DNA loop is bounded on the right by a T2 control element 
whose identifier is 3120. This T2 control element has the DNA sequence 

AAACGATTTTTCCGGCAAATCGGCAAATTGCCGGAATTGTAATTTCCGGC
AAAT 

There are no genes controlled by this T1/T2 loop. 

This long T1/T2 double stranded DNA loop modulates the expression of the
following C1/C2 short loops

A C1/C2 short loop on chromosome 1 whose identifier is 3103 controls the 
expression of the genes of one or more other T1/T2 long loops. This C1/C2 short 
loop has the DNA sequence

TTAAAATTTCCGGCAAATCGGCAAATTGGCAGAAATGAAACTCACGGCAA 
ATCGG 

This T1-T2 loop also modulates the C1/C2 short loops numbered 3104 to 3118

A C1/C2 short loop on chromosome 1 whose identifier is 3119 controls the
expression of the genes of one or more other T1/T2 long loops. This C1/C2 short
loop has the DNA sequence

CCCGCATTTTTTGTAGATCAAACCGTAATGGGACGGCCTGGCAACACGTG 
ATTTTCCAAAT 



The expression of genes in this T1/T2 long loop is controlled by the following C1/C2
short loops.

A C1/C2 short loop on chromosome 4 whose identifier is 16760 controls the
expression of the genes in this T1/T2 long loop. This C1/C2 short loop is expressed
as an RNA single strand that is 3'UTR to the gene T23E1.2 and has the DNA sequence



GGCAAATTGCCGAAATTGAACATTTCCGGCAAATCGGCAAATTGCCGGAA
TTGAACATTTCCGGCAAATCGGCAAATTGCCGGAATTGAACATTTCCGGC
AAATCGGCAAATTGCCGGAATTGA



The match between the T1 sequence and the C1/C2 sequence is



CAAATCGGCAAATTGCCGGAATTGAACATTTCC 



The match between the T2 sequence and the C1/C2 sequence is 

TTTCCGGCAAATCGGCAAATTGCCGGAATTG 
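
The two matches listed for this loop can be reproduced with a short sketch. It assumes that a "match" is the longest run of bases shared by a target sequence and the C1/C2 sequence (a longest common substring); the text does not state that this is how the matches were computed, so the function and variable names below are illustrative only.

```python
def longest_common_substring(a: str, b: str) -> str:
    """Return the longest contiguous run of characters found in both a and b.

    Classic dynamic programming over match lengths; if several runs tie for
    the longest, the first one encountered in `a` is returned.
    """
    best_len, best_end = 0, 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                # Extend the run that ended at a[i-2], b[j-2].
                curr[j] = prev[j - 1] + 1
                if curr[j] > best_len:
                    best_len, best_end = curr[j], i
        prev = curr
    return a[best_end - best_len:best_end]


# Sequences copied from the first C. elegans loop above (variable names are hypothetical).
T1_SEQ = "CAAATCGGCAAATTGCCGGAATTGAACATTTCC"
T2_SEQ = "AAACGATTTTTCCGGCAAATCGGCAAATTGCCGGAATTGTAATTTCCGGCAAAT"
C1C2_16760 = (
    "GGCAAATTGCCGAAATTGAACATTTCCGGCAAATCGGCAAATTGCCGGAA"
    "TTGAACATTTCCGGCAAATCGGCAAATTGCCGGAATTGAACATTTCCGGC"
    "AAATCGGCAAATTGCCGGAATTGA"
)

t1_match = longest_common_substring(T1_SEQ, C1C2_16760)
t2_match = longest_common_substring(T2_SEQ, C1C2_16760)
print(len(t1_match), t1_match)
print(len(t2_match), t2_match)
```

Under this assumption the T1 match is the whole 33-base T1 sequence and the T2 match is the 31-base sequence given above.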



A double stranded DNA loop of length 86.977 kilo-bases on chromosome 3 is
bounded on the left by a T1 sequence whose identifier is 14840. This T1 control
element has the DNA sequence

AAAAATTTCCGGCAAGTCGGCAATTTTCCGAAAATGAAAATTTCCGGCAA
ATCGGCAAATTGCCGGAATTGAAAATTCCTGGCAAATCAGCAAATTTGCG 
GCAAATCGGCAATTTGCCGAAAATGAAAATTTCCGGCAAAT 

This double stranded DNA loop is bounded on the right by a T2 control element 
whose identifier is 15042. This T2 control element has the DNA sequence 

CAAATCGGTAGGTAAATTGGCCAAACTTGAAAATTTCCGGCAAATCGGCA 
AATTCCGCGAACTGAACATTTCCGGCAAATCGGCAAATTGCTCGAACT 

There are no genes controlled by this T1/T2 loop. 

This long T1/T2 double stranded DNA loop modulates the expression of the 
following C1/C2 short loops 

A C1/C2 short loop on chromosome 3 whose identifier is 14841 controls the 
expression of the genes of one or more other T1/T2 long loops. This C1/C2 short 
loop has the DNA sequence 

AAAAATTTCCGGCAAGTCGGCAATTTTCCGAAAATGAAAATTTCCGGCAA 
ATCGGCAAATTGCCGGAATTGAAAATTCCTGGCAAATCAGCAAATTTGCG 
GCAAATCGGCAATTTGCCGAAAATGAAAATTTCCGGCAAAT 

This T1-T2 loop also modulates the C1/C2 short loops numbered 14842 to 15040 

A C1/C2 short loop on chromosome 3 whose identifier is 15041 controls the 
expression of the genes of one or more other T1/T2 long loops. This C1/C2 short 
loop has the DNA sequence 

CGGCAATTGCCGTTCGGCAATTTGCCAATTTGCCGGAAATTTTCAATTCCG 
GCAA

The expression of genes in this T1/T2 long loop is controlled by the following C1/C2
short loops.

A C1/C2 short loop on chromosome 4 whose identifier is 16760 controls the
expression of the genes in this T1/T2 long loop. This C1/C2 short loop is expressed
as an RNA single strand that is 3'UTR to the gene T23E1.2 and has the DNA sequence

GGCAAATTGCCGAAATTGAACATTTCCGGCAAATCGGCAAATTGCCGGAA 
TTGAACATTTCCGGCAAATCGGCAAATTGCCGGAATTGAACATTTCCGGC
AAATCGGCAAATTGCCGGAATTGA 

The match between the T1 sequence and the C1/C2 sequence is

ATTTCCGGCAAATCGGCAAATTGCCGGAATTGAA

The match between the T2 sequence and the C1/C2 sequence is 

TGAACATTTCCGGCAAATCGGCAAATTGC 

A double stranded DNA loop of length 98.488 kilo-bases on chromosome 3 is
bounded on the left by a T1 sequence whose identifier is 15365. This T1 control
element has the DNA sequence

AAAATTTCCGGCAAATCGGCAATTTGCCAAAAATTGAAATTTCCGGCAAA 
TCGGCAATTTGTCAAAAATGAAAATTTCCGGCAAATCGGCAAATTGCCGA 
AAATGAAAATTTCCGGCAAATCGGCAAACTTCCGGAACTGAAAATTTCCG 
GCAAATCGGCAATTTGCCATAAATGAACATTTCCGG...GGCGAAAATTAAA
ATTTCCGCCATATCGGCAATTTGCCAAAAAATTAAAATTTCCGGCAAATC
GGCAAATTGCCGGAATTCAAAATTTCCGGCAAACCGGCAAATTGCCGGAA
CTCAAAATTCCCGGCAAATCAGCAAATTGCCGGAATT

This double stranded DNA loop is bounded on the right by a T2 control element 
whose identifier is 15627. This T2 control element has the DNA sequence 

TGGCAAACCGGCAAATTGCCGGAATTGAACATTTCCGGCAAATCGGCAAT 
TTGCCGGAATTGAAATTT 

There are no genes controlled by this T1/T2 loop. 

This long T1/T2 double stranded DNA loop modulates the expression of the 
following C1/C2 short loops 

A C1/C2 short loop on chromosome 3 whose identifier is 15366 controls the 
expression of the genes of one or more other T1/T2 long loops. This C1/C2 short 
loop has the DNA sequence 

TGCCGATTTGCCGGAAATTTTCATTTTCGGCAATTTGCCGATTTGCCGGAA 
ATTTTCATT 

This T1-T2 loop also modulates the C1/C2 short loops numbered 15366 to 15624 

A C1/C2 short loop on chromosome 3 whose identifier is 15625 controls the 
expression of the genes of one or more other T1/T2 long loops. This C1/C2 short 
loop has the DNA sequence 

TCAAGCAAATTGTCAAATTCGCGGAACTAAACATTTCCGGCAAATCGGCA 
AATT 

The expression of genes in this T1/T2 long loop is controlled by the following C1/C2
short loops.

A C1/C2 short loop on chromosome 4 whose identifier is 16760 controls the
expression of the genes in this T1/T2 long loop. This C1/C2 short loop is expressed
as an RNA single strand that is 3'UTR to the gene T23E1.2 and has the DNA sequence

GGCAAATTGCCGAAATTGAACATTTCCGGCAAATCGGCAAATTGCCGGAA
TTGAACATTTCCGGCAAATCGGCAAATTGCCGGAATTGAACATTTCCGGC
AAATCGGCAAATTGCCGGAATTGA

The match between the T1 sequence and the C1/C2 sequence is

ATTTCCGGCAAATCGGCAAATTGCCGGAATT 

The match between the T2 sequence and the C1/C2 sequence is 

CGGCAAATTGCCGGAATTGAACATTTCCGGCAAATCGGCAA 